ABSTRACT



Recent technological advances in the design and manufacturing of night vision 
multispectral sensors now allow spatially registered imagery provided by each of the 
sensors to be combined within a single fused image for display to an end user. The 
product is a multispectral false-colored rendering of the imaged scene. The use of false 
color in fused imagery may facilitate object recognition, providing contour information of 
the objects present in the scene, but incongruently colored fused imagery may be 
disruptive of perceptual performance. This study investigated whether false color 
imagery helps or hinders object recognition relative to natural color imagery. 
Subjects' reaction times (RTs) and error rates were measured in a standard naming task. 
Stimuli consisted of photographs of food objects that had been manipulated in color 
(natural color, false color, natural grayscale, and false grayscale) and noise (three levels). 
The results of the experiment showed similar differences in RTs between color images 
(natural or false) and their grayscale counterparts at different levels of noise, indicating 
that both color conditions were similarly helpful in object recognition. These results give 
an indication that false color may be useful in multispectral sensors based on its 
facilitation of image segmentation with shape-degraded images. 

EXECUTIVE SUMMARY 



Current night vision devices (NVDs) used in military operations, such as night 
vision goggles (NVG) and forward looking infrared (FLIR) systems, were designed to 
allow operations in low visibility conditions. New military tactics demand 
capabilities that current NVDs can only partially provide. 

Infrared (IR) systems sense radiated energy, detecting thermal differences between 
an object and its background. Image intensifier (I²) sensors amplify reflected moonlight 
and starlight, taking advantage of nighttime illumination conditions. Because the two 
sensors respond to widely separated wavebands within the electromagnetic (EM) spectrum, 
each suffers disadvantages that the other does not, and these can change depending on the 
atmospheric and environmental conditions. Nevertheless, the two current sensing 
modalities seem to be complementary. Accordingly, fusing the imagery from 
these two complementary sensors into a single display may result in equal or better 
operator performance compared to either single-band sensor alone. This 
technique is known as dual-band sensor fusion. 

Currently there is no fielded capability to combine the best attributes of both sensors 
into a single fused image. Recent experimental advances in sensing and data display 
have permitted good progress in real-time image fusion and display of multispectral 
sensor imagery in either monochrome or synthetic chromatic form. 

The image processing challenge is to generate an intuitively meaningful color 
image on a display for a human viewer. Algorithms to perform this function in an 
optimum manner are currently under development. Since neither sensor is in the visible 
waveband, the artificial color mappings applied by some fusion algorithms will 
generally produce false-color imagery whose chromatic characteristics do not correspond 
in any intuitive or obvious way to those of a scene viewed under natural photopic 
illumination. To the degree that human perception relies on stored knowledge of objects’ 
chromatic characteristics, false color images may be disruptive of perceptual 
performance, making colored sensor fusion unhelpful in object recognition. 

The reason for using color in fused imagery is based on the assumption that color 
(natural or artificial) facilitates image segmentation, providing contour information about 
the individual objects present in that scene as a way to achieve target detection. Past 
psychophysical research has been equivocal in determining what utility, if any, sensor 
fusion has for human performance. Research in this field has been inconclusive due to 
differing experimental methodologies used in these studies. 

To measure the effectiveness of sensor fusion devices in enhancing the 
night capabilities of military operators over currently employed systems, detailed 
exploration in the area of human factors was required. 

The objective of this thesis was to quantitatively assess the role of natural and 
artificial color in object recognition when shape information is degraded, investigating 
whether and how false color might be useful in multi-band fused imagery. Digital 
photographs of natural objects (fruits and vegetables) were presented as natural and false 
color images, together with their grayscale counterparts, degraded by different levels of 
noise to emulate imagery generated by multispectral devices, and were compared in a 
standard naming task. Two precise measures of visual ability that are 
critical to the military, reaction time (RT) and rate of accuracy in target detection, were 
measured. 

Natural color might facilitate object recognition in either or both of two ways: by 
facilitating scene segmentation and by allowing access to stored color knowledge. In the 
presence of false colored images, recognition might be disrupted, because access to 
stored knowledge is denied and participants would have to rely on color contrast alone to 
reach object recognition through scene segmentation. It was hypothesized that shorter 
RTs and greater accuracy rates would occur within the natural color images across all 
levels of noise and the difference in RTs between natural color and false color images 
would be largest in the conditions with the greatest amount of noise. The longest RTs and 
greatest error rates were expected within the grayscale images, because participants 
would be able neither to accomplish scene segmentation nor to access stored knowledge 
during the object recognition task. Intermediate results were expected for false color 
images, because the presence of color would at least permit scene segmentation. 

The analysis of the benefit of color in object recognition showed that the natural and 
false hue conditions were equally beneficial in the experimental task. There was no 
evidence that false color was disruptive during this task, and natural and false 
hue were similarly useful at different levels of image degradation. This 
conclusion rests on the assumption that participants relied on a bottom-up process 
during the object recognition task, making use of color (natural or false) to achieve image 
segmentation. 

I. INTRODUCTION 



Current night vision devices (NVDs) used in military operations, such as night 
vision goggles (NVG) and forward looking infrared (FLIR) systems, were designed to 
allow operations in low-visibility conditions. New military tactics demand 
capabilities that current NVDs can only partially provide. Greater target 
discrimination from decoys and background clutter is needed, together with greater 
display resolution, adequate magnification properties, and larger fields of view (Krebs, 
Scribner, Miller, Ogawa & Schuler, 1998; Marine Corps CDC, 1995). By combining 
information from multiple single-band sources within a unitary display, researchers hope to 
overcome perceptual limitations inherent in the images provided by various electro-optical 
sensors singly (Sinai, McCarley, Krebs & Essock, 1999a). 

Infrared (IR) systems sense radiated energy, detecting thermal differences between 
an object and its background. Image intensifier (I²) sensors amplify reflected moonlight 
and starlight, taking advantage of nighttime illumination conditions. Because the two 
sensors respond to widely separated wavebands within the electromagnetic (EM) spectrum, each 
has advantages and suffers disadvantages that the other does not, which can change 
depending on the atmospheric and environmental conditions (Sinai et al., 1999a). For 
example, resolution is better in I² sensors, but contrast between heat-emitting objects and 
their surroundings can be better determined by IR sensors (Sinai et al., 1999a). 

Limitations in each of the sensing modalities can sometimes be disorienting by creating 
visual illusions (Crowley, Rash & Stephens, 1992), while alternating between these 
modalities can be difficult, confusing and distracting (Rabin & Wiley, 1994). 

Nevertheless, both current sensing modalities seem to be complementary. Accordingly, 
fusing the imagery from these two complementary sensors into a single display 
may result in operator performance equal to or better than that obtained with either 
single-band sensor alone. This technique, known as dual-band sensor fusion, could also 
provide scene information not present in either single-band image alone by deriving 
emergent information based on the difference between the sensors (Sinai et al., 1999a). 

The contrast available in a fused image is often displayed as a monochrome or gray 
scale image (Therrien, Scrofani & Krebs, 1997; Peli, Peli, Ellis & Stahl, 1999). 

Techniques developed to introduce synthetic color to fused imagery (Scribner, Satyshur, 
Schuler & Kruer, 1996; Waxman, Gove, Seibert, Fay, Carrick, Racamato, Savoye, Burke, 
Reich, McGonagle & Craig, 1996; Scribner, Warren, Schuler, Satyshur & Kruer, 1998; 
Krebs, McCarley, Kozek, Miller, Sinai & Werblin, 1999a), attempting to provide additional 
information through color contrast, are examples of the emergent information produced 
by sensor fusion. 

For a human operator, the multiple sources of imagery need to be fused and 
displayed in a form that is easy and natural to interpret, improving operator 
performance (Peli et al., 1999). Currently there is no fielded capability to combine the best 
attributes of both sensors into a single fused image. Recent experimental advances in 
sensing and data display have permitted good progress in real-time image fusion and 
display of multispectral sensor imagery in either monochrome or synthetic chromatic form 
(McDaniel, Scribner, Krebs, Warren, Ockman & McCarley, 1998). 

What is needed are new image processing techniques that combine the multispectral 
images so that the resultant image has more information content than any of the 
original images, as has been demonstrated by several researchers (Scribner et al., 1998; 
Krebs et al., 1999a; Therrien et al., 1997; Waxman, Aguilar, Fay, Ireland, Racamato, 
Ross, Carrick, Gove, Seibert, Savoye, Reich, Burke, McGonagle & Craig, 1998). This 
requires studies in data formatting such as color-coding or object enhancements (e.g., 
towers or hanging power lines for obstacle avoidance) (McDaniel et al., 1998). The image 
processing challenge is to generate an intuitively meaningful color image on a display for a 
human viewer that improves operator performance, facilitating discrimination of 
objects from backgrounds and situational awareness by means of scene segmentation. 

Past psychophysical research has been equivocal in determining what utility, if any, 
sensor fusion has for human performance. While some studies have found a significant 
advantage for fused imagery over single sensor imagery (Essock et al., 1999a; Toet, 
Ijspeert, Waxman & Aguilar, 1997; Waxman et al., 1996), others have not (Steele and 
Perconti, 1997; Krebs et al., 1998; Essock, Sinai, McCarley, DeFord & Srinivasan, 
1999b). These discrepant results can be attributed to differences in the fusion algorithms 
tested and in the psychophysical tasks employed (Essock et al., 1999b). 
It is therefore not obvious that sensor fusion will benefit perceptual performance 
(Sinai, McCarley & Krebs, 1999b). 

Since neither sensor is in the visible waveband, the artificial color mappings 
produced by some fusion algorithms will generally produce false-color imagery whose 
chromatic characteristics do not correspond in any intuitive or obvious way to those of a 
scene viewed under natural photopic illumination. To the degree that human perception 
relies on stored knowledge of objects’ chromatic characteristics, false color images may be 
disruptive of perceptual performance (Sinai et al., 1999b), making colored sensor fusion 
unhelpful in object recognition. 
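
As a rough illustration of what such an artificial color mapping can look like, the sketch below assigns two co-registered single-band intensities to display color channels. The channel assignment (IR to red, I² to green, their absolute difference to blue) and the names `fuse_pixel` and `fuse_image` are assumptions made purely for illustration; they are not the mappings used by the fusion algorithms cited in this thesis.

```python
def fuse_pixel(ir, i2):
    """Map two co-registered band intensities in [0, 1] to an (R, G, B) triple.

    Hypothetical mapping for illustration only: IR -> red, I2 -> green,
    absolute band difference -> blue.
    """
    for value in (ir, i2):
        if not 0.0 <= value <= 1.0:
            raise ValueError("band intensities must lie in [0, 1]")
    return (ir, i2, abs(ir - i2))


def fuse_image(ir_image, i2_image):
    """Fuse two equally sized 2-D lists of intensities, pixel by pixel."""
    return [
        [fuse_pixel(ir, i2) for ir, i2 in zip(ir_row, i2_row)]
        for ir_row, i2_row in zip(ir_image, i2_image)
    ]


# A warm object (high IR) against a cool but well-lit background (high I2).
ir_band = [[0.9, 0.1], [0.1, 0.1]]
i2_band = [[0.2, 0.8], [0.8, 0.8]]
fused = fuse_image(ir_band, i2_band)
```

Under this assumed mapping the warm object is rendered in a reddish-magenta tone and the background in a green-blue tone. Neither color has any relation to how the scene would look under daylight, which is the sense in which such imagery is "false colored."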

The reason for using color in fused imagery is based on the assumption that color 
(natural or artificial) facilitates image segmentation (Walls, 1942), providing contour 
information about the individual objects present in that scene as a way to achieve target 
detection. However, the role of color in object recognition has itself not 
been clearly established. Past research in this field has been inconclusive 
because of the differing experimental methodologies used. Several tasks and 
different types of stimuli were presented to the participants. Observers were required to 
recognize or identify natural or manufactured objects presented as colored or achromatic 
photographs, line drawings, artificially colored photographs, etc., with noise or blur used as 
image-degrading factors to simulate poor resolution conditions (Wurm, Legge, 
Isenberg & Luebker, 1993; Ostergaard and Davidoff, 1985; Biederman and Ju, 1988; 
Joseph and Proffitt, 1996). 

To measure the effectiveness of sensor fusion devices in enhancing the 
night capabilities of military operators over currently employed systems, detailed 
exploration in the area of human factors is required. 

This thesis is focused on the human factors of sensor fusion, more specifically, 
human perception of natural and artificial color images similar to those produced by sensor 
fusion processes. The objective of this thesis is to quantitatively assess the role of natural 
and artificial color in object recognition when shape information is degraded, investigating 
whether and how false color might be useful in multi-band fused imagery. Digital 
photographs of natural objects (fruits and vegetables) were presented as natural and false 
color images, together with their grayscale counterparts, degraded by different levels of 
noise to emulate imagery generated by multispectral devices, and were compared in a 
standard naming task. Two precise measures of visual ability that are critical to the 
military, reaction time (RT) and rate of accuracy in target detection, were measured. 
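
The stimulus design described above (four color conditions crossed with several noise levels) can be sketched as follows. This is only a minimal illustration. The hue rotation used to produce false color, the Rec. 601 luma weights used for grayscale, the Gaussian noise model, and the noise magnitudes are all assumptions, since this section does not specify how the images were actually manipulated.

```python
import colorsys
import random


def to_false_color(pixel, hue_shift=0.5):
    """Rotate the hue of an RGB pixel (channels in [0, 1]) by hue_shift turns."""
    h, l, s = colorsys.rgb_to_hls(*pixel)
    return colorsys.hls_to_rgb((h + hue_shift) % 1.0, l, s)


def to_grayscale(pixel):
    """Replace a pixel by its luma (Rec. 601 weights) on all three channels."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (y, y, y)


def add_noise(pixel, sigma, rng):
    """Add zero-mean Gaussian noise per channel, clipped to [0, 1]."""
    return tuple(min(1.0, max(0.0, c + rng.gauss(0.0, sigma))) for c in pixel)


def make_conditions(image, noise_sigmas=(0.0, 0.1, 0.2), seed=0):
    """Return {(color_condition, noise_level): image} for a flat list of pixels."""
    rng = random.Random(seed)
    false_img = [to_false_color(p) for p in image]
    base = {
        "natural color": list(image),
        "false color": false_img,
        "natural grayscale": [to_grayscale(p) for p in image],
        "false grayscale": [to_grayscale(p) for p in false_img],
    }
    return {
        (name, level): [add_noise(p, sigma, rng) for p in pixels]
        for name, pixels in base.items()
        for level, sigma in enumerate(noise_sigmas)
    }


tomato = [(0.8, 0.1, 0.1)] * 4   # a tiny stand-in for a photograph
stimuli = make_conditions(tomato)
```

Each grayscale image is derived from its own color version, so the "false grayscale" condition controls for any luminance change introduced by the false-color manipulation.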

Natural color might facilitate object recognition in either or both of two ways: by 
facilitating scene segmentation (Walls, 1942) and by allowing access to stored color 
knowledge (Joseph and Proffitt, 1996). In the presence of false colored images, 
recognition may be disrupted, because access to stored knowledge is denied and 
participants will have to rely on color contrast alone to reach object recognition through 
scene segmentation. It is hypothesized that shorter RTs and greater accuracy rates will 
occur within the natural color images across all levels of noise and the difference in RTs 
between natural color and false color images will be largest in the conditions with the 
greatest amount of noise. Faster RTs are expected within the natural color images 
because participants will use color information to access stored knowledge of the object’s 
chromatic features, and they will also be able to accomplish scene segmentation. Larger effects 
of natural color images are also expected in the conditions with higher levels of noise 
because here, since the objects’ shape information is degraded, subjects may be expected 
to rely more heavily on color information to recognize the stimuli. The longest RTs and 
greatest error rates are expected within the grayscale images, because participants will 
be able neither to accomplish scene segmentation nor to access stored knowledge during the 
object recognition task. Intermediate results are expected for false color images, because 
the presence of color at least makes scene segmentation possible. If 
color is used only for scene segmentation, similar effects of natural color and false color 
images are expected, although responses to both should be faster and more accurate than 
those to the grayscale images. 

II. BACKGROUND 



A. HISTORY 

The Vietnam War era witnessed the great expansion of the infrared industry. This 
industrial development was motivated by the inability of the U.S. forces to prevent North 
Vietnamese forces from conducting night operations (Schwarzkopf, 1992). 

Since the post-Vietnam era, all high-value military platforms have been equipped with NVDs. 
These systems, which use specific sensors and techniques to acquire and engage 
opposing forces during low visibility or nighttime periods under adverse warfare 
environments, have proven effective in all kinds of combat operations (NVESD, 
1997). However, unanticipated problems have arisen while utilizing these devices. 
Unaided human perception of the surroundings at night is vastly different from perception 
with NVDs (Vargo, 1999). Users’ lack of understanding of the night environment and 
its impact on NVD performance has caused the capabilities of these devices to be 
exceeded, resulting in numerous mishaps (Salvendy, 1997). Also, the increasing 
sophistication of military tactics and weapon systems requires enhanced capabilities that 
current NVDs cannot provide (Krebs et al., 1998). Multiband image fusion 
devices, currently under development, are intended to overcome several of the existing 
limitations of infrared systems and to accomplish the tasks required by modern nighttime 
warfare. In these new devices, the information provided by each of the sensors in the 
system is combined into a single fused image before being displayed to an end user. The 
resulting image is a multispectral false-colored rendering of the imaged scene. The 
expected advantage of fused images lies not only in retaining the most helpful attributes of 
each of the fused sensors, but also in obtaining additional information based on the difference 
between the sensors. 

This study will investigate one of the unsolved problems of these new NVDs: the 
use of color in the resulting fused imagery. A generic presentation of how the human 
visual system (HVS) accomplishes the perception of color will provide a basic 
understanding of the problems related to adding color to multisensor fused systems. A 
general description and characteristics of single-band sensors currently in use are provided 
to aid in the comprehension of image fusion techniques and future multisensor devices. 
Previous research involving the role of color in object recognition is summarized, along 
with several studies that investigate and develop different techniques of color fusion. 



B. PERCEPTION OF COLOR IN THE HUMAN VISUAL SYSTEM 
1. Electromagnetic (EM) Spectrum 

The first characteristic of the night environment relevant to an understanding of 
night vision technology is the EM spectrum of the night sky and its relationship to the eye 
and to NVDs. NVDs allow us to exploit a greater portion of the EM spectrum 
than the human eye can. This can be seen in Fig. 1. 

NVGs, FLIR systems, and most other night imaging devices, like the human eye, are 
sensitive to different wavelengths of the EM spectrum. These wavebands are 
similar in nature, and their relationship can be clearly expressed by their position in the 
EM spectrum, shown in Fig. 2. 

As can be seen, the optical band covered by visible light is a relatively small 
portion of the entire spectrum. Visible and near-IR light are considered to be reflected 
energy, while the thermal bands in the mid and far infrared are primarily radiated energy. 

2. Human Visual System 

The human visual system (HVS) is sensitive to radiation whose wavelength is in 
the 0.4 to 0.7 micron range of the EM spectrum. When a combination of these wavelengths 
reaches the human eye, neural processing of the resulting signals gives rise to the 
psychological experience called color vision. Visible radiation received by the HVS may come directly 
from a light source, but is usually reflected by object surfaces before reaching our eyes. 

Three primary perceptual dimensions of these radiations combine to define our 
psychological perception of color: hue (wavelength of the radiation), saturation (hue 
purity) and lightness (intensity of the light source). 

Hue is the reaction to wavelengths ranging from 0.4 microns (violet) to 0.7 
microns (red). As Newton demonstrated, white light really consists of a combination of 
different colored lights. Each wavelength in this range, after being processed by 
the HVS, produces the perception of a specific color, as shown in Fig. 3. 

Figure 2: Visible Color Spectrum (Matlin & Foley, 1997) 

One way to organize colors, proposed by Newton in 1704, is in terms of a color 
wheel. The outside of the wheel represents monochromatic colors (those that can be 
produced by a single wavelength) plus non-spectral hues (those that cannot be described 
in terms of a single wavelength from the visual spectrum). Similar hues are located near 
one another. 

In addition to hue, our experience of color is characterized by lightness and 
saturation. Objects vary in the amount of light they reflect from their surfaces. Lightness 
is the apparent reflectance of a color. It describes our psychological reaction to the 
physical characteristic, reflectance. Objects’ lightness varies from very dark (black) to very 
light (white), with other shades of lightness in between (Matlin & Foley, 1997). 

Another characteristic of color is saturation, our psychological reaction to the 
physical characteristic, purity. Saturation measures the amount of white light added to a 
hue. A saturated hue, lying on the perimeter of the color wheel with no white light added, is 
perceived as a deep hue. An unsaturated hue lies closer to the center of the wheel and 
is perceived as a much lighter hue. Completely unsaturated colors are called achromatic 
or neutral, and they are perceived as white, shades of gray, or black, depending on their 
lightness. These colors are represented in the center of the color wheel, as 
shown in Fig. 4. 
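
The relationship between added white light and saturation can be illustrated numerically. The sketch below uses the HSV saturation computed by Python's standard `colorsys` module as a stand-in for perceived saturation; that choice, and the helper names `add_white` and `saturation`, are assumptions made for illustration rather than part of the color model described in the text.

```python
import colorsys


def add_white(rgb, fraction):
    """Blend an RGB color with white light; fraction=0 keeps the color, 1 gives white."""
    return tuple((1.0 - fraction) * c + fraction * 1.0 for c in rgb)


def saturation(rgb):
    """HSV saturation: 1.0 for a pure hue, 0.0 for an achromatic color."""
    return colorsys.rgb_to_hsv(*rgb)[1]


pure_red = (1.0, 0.0, 0.0)            # on the perimeter of the color wheel
pink = add_white(pure_red, 0.5)       # the same hue diluted with white light
gray = (0.5, 0.5, 0.5)                # achromatic: the center of the wheel
```

Here `saturation(pure_red)` is larger than `saturation(pink)`, which in turn is larger than `saturation(gray)`, while the hue of `pink` is unchanged from that of `pure_red`. This mirrors the movement from the perimeter of the wheel toward its center.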

The mixture of monochromatic hues produces the perception of the whole 
diversity of colors in the human visual spectrum. Hues can be mixed in two different 
ways. Subtractive mixing involves a single light source whose light passes through filters 
or falls on pigments. Parts of the spectrum are absorbed, or subtracted, as represented in Fig. 5. 
Additive mixing is accomplished by combining colored lights of different 
wavelengths, as represented in Fig. 6. The result of color mixing can be predicted by 
using the color wheel. A graphic explanation of this method is shown in Fig. 7. 

Color television is an example of additive mixing. The screen consists of many 
tiny dots that glow blue, green or red when irradiated. All other colors are produced 
by the combination, in our photoreceptors, of the different lights generated by the screen dots 
when viewed from an appropriate distance. A yellow patch is really composed of red 
and green dots (see Figure 4). 

Hue wavelengths are not evenly arranged around the periphery of the wheel. This 
distribution is necessary to place complementary hues on exactly opposite sides of the 
color wheel. Complementary hues are those whose additive mixtures make an 
achromatic color (shade of gray). 
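
Additive mixing and complementary hues can be sketched with simple arithmetic on red, green and blue light intensities. The RGB representation, the clipping at full intensity, and the function names below are illustrative assumptions, not a model taken from the cited sources.

```python
def additive_mix(*lights):
    """Sum RGB light intensities channel by channel, clipping at full intensity."""
    return tuple(min(1.0, sum(channel)) for channel in zip(*lights))


def is_achromatic(rgb, tol=1e-9):
    """True when all three channels are equal (white, gray or black)."""
    return max(rgb) - min(rgb) <= tol


RED, GREEN, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

yellow = additive_mix(RED, GREEN)        # red and green dots seen together
blue_plus_yellow = additive_mix(BLUE, yellow)
```

In this sketch `yellow` has full red and green components and no blue, and `is_achromatic(blue_plus_yellow)` is true, which is what it means for blue and yellow to be complementary under additive mixing.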

By means of either of these two techniques, colored light reaches the human visual 
system (HVS), producing the perception of color. The way in which colored light 
produces the perception of color in the HVS is explained by two theories, each 
applying to a different level of the visual processing system. Trichromatic theory explains 
the way in which the input signal from the photoreceptors is combined (Neitz, Neitz & 
Jacobs, 1993). Opponent process theory explains how the information provided by the 
photoreceptors is interpreted by the neural system (De Valois & De Valois, 1975). 

Figure 5: Subtractive mixtures for blue paint and yellow paint. (Matlin & Foley, 1997) 

Figure 6: Additive mixtures for blue light and yellow light. (Matlin & Foley, 1997) 

The Young-Helmholtz Trichromatic theory assumes that humans have three kinds 
of color receptors, each differentially sensitive to light from a different part of the visual 
spectrum. These receptors are called “cones” and they work best in well-lit 
environments, giving rise to the full range of colors (achromatic and chromatic). There is 
another kind of receptor in the human retina, the “rods,” which work best in 
poorly lit environments, where they give rise to the perception of achromatic colors. 

Visual perception research has established that the three kinds of cone pigments 
have different but overlapping absorption curves (De Valois and De Valois, 1975), each 
of them being maximally sensitive to a different wavelength, as shown in Fig. 8. 

We will refer to these three kinds of cones as S (short wavelength), M (medium 
wavelength) and L (long wavelength) based on the wavelengths to which they are most 
sensitive. In this way, human visual receptors are able to distinguish the wavelength of 
an incoming signal, because it will activate one or several receptors in a unique pattern or 
distribution for each wavelength. 
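
The idea that each wavelength produces its own pattern of activation across the three cone types can be sketched as follows. The Gaussian shape of the sensitivity curves, their common width, and the peak wavelengths are rough illustrative assumptions; real cone absorption curves are not Gaussian, and the values below are not taken from Fig. 8.

```python
import math

# Approximate peak sensitivities in nanometers; illustrative values only.
CONE_PEAKS_NM = {"S": 440.0, "M": 535.0, "L": 565.0}
BANDWIDTH_NM = 45.0   # assumed common width of each overlapping curve


def cone_responses(wavelength_nm):
    """Relative response of each cone type to a monochromatic light."""
    return {
        name: math.exp(-0.5 * ((wavelength_nm - peak) / BANDWIDTH_NM) ** 2)
        for name, peak in CONE_PEAKS_NM.items()
    }


def response_pattern(wavelength_nm):
    """Responses normalized to sum to 1, so only the S:M:L ratio remains."""
    raw = cone_responses(wavelength_nm)
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def most_active_cone(wavelength_nm):
    """Name of the cone type responding most strongly at this wavelength."""
    raw = cone_responses(wavelength_nm)
    return max(raw, key=raw.get)
```

Because the curves overlap, a light of 500 nm and a light of 580 nm both excite the M and L cones, but in different proportions, so `response_pattern(500)` and `response_pattern(580)` differ. It is this ratio, rather than the output of any single receptor, that identifies the wavelength.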

Trichromatic theory by itself cannot explain all the color phenomena. Some 
mechanism beyond the receptor level must combine the information from the cones in a 
complex way. Several pieces of evidence point to the existence of separate mechanisms 
for red, yellow, green and blue. How do these four mechanisms arise from only three 
cone systems? Human sense of color must arise from additional processing of the input 
from the three-cone system. 

Opponent-process theory (De Valois & De Valois, 1975) covers this second level 
of the visual processing system, beyond the photoreceptors. This process is implemented 
by means of the neural connections among photoreceptors and neurons in the human retina. 
De Valois & De Valois (1975) modeled these possible connections, showing how both 
chromatic and achromatic information could be conveyed through identical mechanisms 
and illustrating how four color channels could arise from three cone systems (Matlin 
& Foley, 1997). This theory, whose development is far beyond the scope of this thesis, is 
the basis for several fusion algorithms (Waxman, Fay, Gove, Seibert, 
Racamato, Carrick & Savoye, 1995). 

Figure 8: Absorption curves for the three cone pigments. (Matlin & Foley, 1997) 
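
How four color channels can arise from three cone signals may be sketched with a simplified textbook recombination. The specific weights and function names below are assumptions chosen for clarity; they are not the De Valois and De Valois model, nor the scheme used by any fusion algorithm cited here.

```python
def opponent_channels(l, m, s):
    """Recombine cone signals (each in [0, 1]) into three opponent signals."""
    red_green = l - m                 # > 0 signals red, < 0 signals green
    blue_yellow = s - (l + m) / 2.0   # > 0 signals blue, < 0 signals yellow
    achromatic = (l + m) / 2.0        # brightness signal
    return red_green, blue_yellow, achromatic


def dominant_hues(l, m, s):
    """Name the active pole of each chromatic channel (four possible hues)."""
    red_green, blue_yellow, _ = opponent_channels(l, m, s)
    hues = []
    if red_green != 0:
        hues.append("red" if red_green > 0 else "green")
    if blue_yellow != 0:
        hues.append("blue" if blue_yellow > 0 else "yellow")
    return hues
```

For example, `dominant_hues(0.9, 0.3, 0.1)` reports red and yellow, while equal cone signals such as `dominant_hues(0.5, 0.5, 0.5)` activate neither chromatic channel and leave only the achromatic signal. Two signed channels thus yield the four hue mechanisms (red, green, blue, yellow) from three receptor types.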

Another characteristic of color perception is color constancy. Because of color constancy, 
humans tend to see the hue of an object as staying the same despite changes in the 
wavelength of the light illuminating the object. Variations in illumination may arise 
from changes in the intensity or in the composition of the illumination source. Absolute 
color constancy would be obtained if an object appeared to be the same color regardless 
of the type of illumination or the colors of nearby objects (Maloney, 1993). Our 
perception of color, then, is not dependent on the absolute wavelengths reaching our 
retinas, but on reflectance relationships among objects in our field of vision (Brou, 
Sciascia, Linden & Lettvin, 1986). Color constancy is probably not maintained 
completely, so human color perception is influenced to a degree by the nature of the 
illumination. Because the light reflected from a surface varies in this way, the HVS 
has developed a variety of mechanisms to disentangle the effects of varying 
illumination and thereby achieve nearly constant color perception based on distal 
surface reflectivity (Matlin & Foley, 1997). As a result, color 
constancy seldom breaks down to the extent that an observer would assign two different 
color names to the same object just because of changes in illumination (Jameson & 
Hurvich, 1989). When an object is seen under different illumination conditions, it might 
look slightly different, but it will still be recognized as the same color. A major limitation 
of sensor fusion systems is that these mechanisms cannot be duplicated to achieve the 
same constant color perception. 



C. CURRENT NIGHT VISION DEVICES 



Night vision devices (NVDs) enable exploitation of the night environment by the 
NVD user by processing EM bands outside the human visual spectrum. These devices do 
not allow perfect vision during nighttime operations, but they do enable humans to 
improve their performance in multiple tasks such as movement on foot or even night 
attacks using sophisticated weapon systems, whether land-based or airborne. 

Current military night operations are enabled through imaging in the visible-near 
infrared band (wavelengths of 0.57 to 0.9 microns) and in the thermal infrared band 
(wavelengths of 7 to 14 microns). Fig. 1 shows the portions of the EM spectrum covered 
by NVDs. Both types of NVDs are explained in more detail in the next two subsections. 

1. Image Intensifiers 



Image intensifiers (I²) process the visible and near-infrared spectrum and, much 
like the human visual system, depend almost entirely on reflected energy from scene 
illumination (MAWTS-1, 1995). They amplify reflected moonlight and starlight (primarily 
yellow through near-infrared light, with wavelengths of 0.57 to 0.9 microns) and ambient 
light produced by artificial sources of illumination (visible light wavelength of 0.4 to 0.7 
microns). 

Visible and near-infrared imagery is currently provided by the third generation of I² 
tubes. The five major components of an I² tube are the objective lens, the photocathode, 
the microchannel plate, the phosphor screen and the eyepiece lens. Radiant or reflected 
optical energy received at this device is focused, converted into electrical energy, amplified and 
converted back into green-yellow light in the 0.56 micron range, matching the peak 
sensitivity of photopic human vision. Finally, the image is inverted and focused before reaching 
the operator’s eye. Image-intensified imagery is usually displayed in night vision goggles 
(NVGs). 

The ratio of the brightness of the image at the output of the eyepiece lens to the 
luminance of the light entering the objective lens is called the gain of the I² tube. The 
variants of the Gen III NVGs currently used have a gain of 25,000, a substantial 
advantage over the unaided human eye in the night environment (MAWTS-1, 1995). 
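
As a rough illustration of the gain definition above, the following sketch computes gain as the ratio of output brightness to input luminance. The function names and the sample values are hypothetical; only the gain figure of 25,000 comes from the text, and both quantities are assumed to be expressed in the same units.

```python
def tube_gain(output_brightness: float, input_luminance: float) -> float:
    """Gain = output brightness at the eyepiece / luminance entering the objective.

    Both arguments are assumed to use the same photometric unit.
    """
    if input_luminance <= 0:
        raise ValueError("input luminance must be positive")
    return output_brightness / input_luminance


def output_for_gain(input_luminance: float, gain: float = 25_000.0) -> float:
    """Output brightness produced by a tube with the given gain.

    The default gain of 25,000 is the Gen III figure quoted in the text.
    """
    if input_luminance < 0:
        raise ValueError("input luminance cannot be negative")
    return input_luminance * gain


if __name__ == "__main__":
    # Hypothetical scene: 2 units of light in, 50,000 units out.
    print(tube_gain(50_000.0, 2.0))
    print(output_for_gain(2.0))
```

The second function simply inverts the ratio, which is convenient when the gain is known and the input level varies with ambient illumination.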

Illumination, expressed in lumens per square meter (lm/m²) or lux, measures the 
amount of visual energy that exists in a specific location. Lunar illumination is the primary 
energy source for natural illumination in the night sky (MAWTS-1, 1995). Additionally, 
stellar phenomena and starlight provide a certain amount of illumination. Figure 9 shows 
how moonless night sky illumination almost matches the peak sensitivity of NVGs. 

Two other contributors of illumination are the sun and artificial sources. The 
setting sun at zero degrees below the horizon is too bright for NVG operations; however, 
approximately one half hour after sunset, when the sun has lowered to seven degrees 
below the horizon, it may provide usable illumination until it has set past twelve degrees. 
Artificial lighting such as street lights or radio tower warning lights can also provide 
significant illumination, but large concentrations of artificial illuminators can wash out the 
NVG image. 




Figure 9: Moonless night spectral composition (MAWTS-1, 1995) 

The atmosphere is the most important environmental factor controlling the 
performance of the NVGs. The atmosphere can attenuate the light, reducing the level of 
energy reaching the NVGs. This attenuation occurs mainly by absorption or scattering, 
because attenuation by refraction is almost negligible. NVGs operate by 
intensifying light energy between 625 and 960 nanometers. Any attenuation, either before 
or after the light strikes the terrain, will effectively reduce the usable light available to the 
NVG and thus affect the resulting image. Attenuation is caused by the collision of light 
with particles larger than one micron in length such as water vapor, dust, snow, and other 
natural or man-made obscurants. The effect of these particles depends very much on 
their size and density, but all of them affect distance estimation and depth perception, 
significantly reducing the usefulness of these devices and even making them almost 
useless during adverse atmospheric conditions (MAWTS-1, 1995). 

2. Thermal Infrared Devices 

The thermal infrared devices, supported by several kinds of forward-looking 
infrared (FLIR) imaging devices, convert invisible thermal energy from the far infrared 
spectrum into a visible image. FLIRs generally process emissions from two infrared 
bands, midwave (3 to 5 microns) and long wave (8 to 12 microns). Infrared energy 
(thermal energy) within these bandwidths is emitted by all objects with a temperature 
above absolute zero (-273 degrees Celsius). 

Natural thermal energy is produced when objects that have previously absorbed 
thermal energy from IR sources, such as the sun or warm air currents, start radiating this 
energy. Another source of thermal energy is from man-made objects such as the heat 
radiated as a result of the friction from moving parts in mechanical devices (MAWTS-1, 
1995). It is important to note that most man-made objects emit in the 8 to 12 micron 
band, hence the military interest in LWIR sensors (Sampson, 1996). 
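
One way to see why objects near ambient temperature fall in the long wave band is Wien's displacement law, which gives the wavelength of peak blackbody emission as a constant divided by absolute temperature. The thesis does not invoke this law; it is used here only as a standard physics sketch, and the function names and sample temperatures are hypothetical. The band limits are the ones quoted in the text.

```python
# Wien displacement constant in micron-kelvin (standard physical constant).
WIEN_B_UM_K = 2897.77


def peak_wavelength_um(temp_kelvin: float) -> float:
    """Wavelength (microns) of peak blackbody emission at the given temperature."""
    if temp_kelvin <= 0:
        raise ValueError("temperature must be above absolute zero")
    return WIEN_B_UM_K / temp_kelvin


def flir_band(wavelength_um: float) -> str:
    """Classify a wavelength against the two FLIR bands quoted in the text."""
    if 3.0 <= wavelength_um <= 5.0:
        return "midwave"
    if 8.0 <= wavelength_um <= 12.0:
        return "long wave"
    return "outside both bands"


if __name__ == "__main__":
    # 300 K is an assumed, roughly room-temperature object; 700 K an assumed hot part.
    for temp in (300.0, 700.0):
        peak = peak_wavelength_um(temp)
        print(f"{temp:.0f} K peaks near {peak:.2f} microns ({flir_band(peak)})")
```

Under these assumptions, an object near 300 K peaks between 9 and 10 microns, inside the 8 to 12 micron band, while a much hotter surface shifts toward the midwave band.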

In order to measure the thermal energy radiated by an object we define emissivity 
(E) as the ratio of an object’s ability to emit thermal energy at a certain temperature to 
that of a blackbody at the same temperature. A “blackbody” is defined as a perfect 
absorber of thermal energy and therefore also a perfect emitter, with an efficiency of unity 
(MAWTS-1, 1995). Factors impacting emissivity include material composition, ambient 
temperature and the object’s temperature and geometry. Most natural objects have a high 
emissivity and therefore a majority of their thermal signature is from self-emission. 
Conversely, objects with low emissivity have a corresponding high reflectivity and 
therefore reflect thermal energy of their surroundings. 
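
The definition of emissivity and its trade-off with reflectivity can be written compactly. The symbols M (energy emitted per unit area), T (temperature) and ρ (reflectivity) are notational assumptions introduced here, and the second relation holds only for opaque objects, which transmit no thermal energy.

```latex
E = \frac{M_{\text{object}}(T)}{M_{\text{blackbody}}(T)}, \qquad 0 \le E \le 1,
\qquad \rho = 1 - E \quad \text{(opaque object)}
```

A blackbody has E = 1 and ρ = 0, while a low emissivity object has ρ close to 1 and mostly reflects the thermal energy of its surroundings.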

Thermal energy emitted by an object, whether it is internally generated or reflected 
by another source, determines its thermal signature. It is primarily the difference among 
thermal signatures of objects that defines the thermal scene (Sampson, 1996). An 
important measure of performance of a FLIR is “delta T”, the temperature difference 
between an object and its background (MAWTS-1, 1995). The cyclic heating and cooling 
of the terrain causes the diurnal cycle of temperature differences between objects of 
different thermal mass and inertia. 

Figure 10 shows the diurnal cycle of temperature differences for an armored 
vehicle and the background terrain. The graph shows the negative 
thermal contrast (object cooler than background) of the armored vehicle on a clear sunny 
day and the positive thermal contrast (object warmer than background) of the armored 
vehicle at night. Because of the positive thermal contrast, FLIRs are able to detect 
targets against the background during night periods. 




Figure 10: Diurnal cycle example (MAWTS-1, 1995) 



Attenuation of thermal energy after it leaves its source can occur by absorption or 
scattering. Atmospheric vapor or humidity is the most significant absorber of thermal 
energy. In very hot and humid climates, the high amount of absorption may literally 
render the FLIR useless (MAWTS-1, 1995). Molecular scattering occurs when thermal 
energy strikes particles present in the atmosphere, such as nitrogen, oxygen, 
water vapor and carbon dioxide. Because of these collisions, thermal energy can be 
scattered in different directions, making it difficult to reach the IR sensor (MAWTS-1, 1995). 

FLIR systems are complex and their detailed composition is far beyond the scope 
of this thesis. The basic elements of this device are the infrared sensor, the signal 
processor and the display unit. The detector array is composed of semiconductor 
material, which converts 8 to 12 micron thermal energy into an analog electrical output to 
the signal processor. The signal processor provides the special signal functions required to 
stabilize and enhance the analog output from the detector array. The signal from the 
processor is transformed into an image through the use of a cathode ray tube (CRT) and 
displayed on a “heads down display” (HDD), cockpit “heads up display” (HUD) or 
“helmet mounted display” (HMD) (MAWTS-1, 1995). Current FLIR technology is 
centered on the first generation (Gen I) FLIR thermal imaging device. The U.S. Army 
began integration of second generation FLIRs into new and existing weapon systems to 
maximize U.S. forces’ advantage on the battlefield (NVESD, 1997). IR imagery is 
displayed using a variety of FLIR imaging devices (both 
scanners and IR focal plane arrays) on monochrome phosphor monitors, the 
cockpit heads-up display, or combiner optics. 

D. MULTISPECTRAL IMAGE FUSION DEVICES 



The two sensing modalities currently used for night vision purposes (I² and IR) 
have been improved due to increases in sensor detection ranges and display resolution, but 
they still have their own limitations: I² devices need reflected light for detection, and IR 
devices must be able to detect thermal contrasts among the objects in the scene (Vargo, 
99). Recent advances in signal processing have made it possible to combine the 
best attributes of the emissive radiation sensed by the thermal sensor and the reflected 
radiation sensed by the image intensifier sensor into a single “fused” image (Steele and 
Perconti, 1997). 

Long wave IR and I² sensors are good candidates for image fusion. The thermal 
contrast between relatively high emissivity objects and the background is a good indicator 
of potential targets, obstacles and waypoints. The inability to see details in areas that have 
relatively poor thermal contrast, caused by low emissivity differences, might be largely 
compensated for by fusing with an I² sensor. Given the proper illumination conditions, the 
“visible” contrast can provide very useful cues that are independent of thermal conditions. 
I² sensors might also aid in producing a natural representation of the scene due to the 
proximity of this waveband to the visible (0.4 to 0.7 microns) waveband (Steele 
and Perconti, 1997). 

Recent technological advances in the production of multispectral sensors now 
allow I² and IR imagery to be mapped to a high speed processor where it can be fused 
and displayed to an end user (McDaniel et al., 1998). Some advantages of combining 
multiple spectral imagery into a single display might be: (a) reduced cost, space, 
computational processing and weight requirements from combined resources, (b) reduced 
operator workload by limiting the need to alternate between the two sensors, and 
(c) improved object search, detection and recognition. 

Numerous fusion techniques have been developed in recent years that 
produce both monochromatic and color imagery (Toet, van Ruyven & Valeton, 1989; 
Scribner, Satyshur & Kruer, 1993; Palmer, Ryan, Tinkler & Creswick, 1993; Toet & 
Walraven, 1996; Therrien et al., 1997; Waxman, Gove, Fay, Racamato, Garrick, Seibert & 
Savoye, 1997). These techniques may differ in their algorithmic approach, but they all 
have the same objective: improving the image quality for the observer (Krebs, Scribner, 
McCarley, Ogawa & Sinai, 1999b). 

A typical color fusion process maps the two monochrome bands generated by 
the I² and IR sensors onto display variables such as the red-green-blue (RGB) channels 
(based on human trichromatic vision theory) and opponent color processing (Waxman et 
al., 1997). This approach takes advantage of the observer’s color vision system to 
introduce additional dimensionality for interpretation through color contrast. The use of 
color in image fusion has frequently been advocated on the argument that color contrast can 
provide improved detection performance when added to luminance contrast (Peli et al., 
1999). 
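
The idea of mapping two monochrome bands onto display channels can be sketched in a few lines. This is a deliberately simple, hypothetical mapping (IR to red, I² to green, their clipped difference to blue) chosen only for illustration; it is not the opponent color algorithm of Waxman et al. (1997) or any other published technique, and the function names and 8-bit pixel range are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def fuse_pixel(i2: int, ir: int) -> Pixel:
    """Map one pair of 8-bit sensor values to a false-color (R, G, B) triple."""
    for value in (i2, ir):
        if not 0 <= value <= 255:
            raise ValueError("sensor values must be in the range 0..255")
    red = ir                  # thermal (emitted) energy drives the red channel
    green = i2                # intensified (reflected) light drives the green channel
    blue = max(i2 - ir, 0)    # areas brighter in I2 than in IR gain some blue
    return (red, green, blue)


def fuse_images(i2_img: List[List[int]], ir_img: List[List[int]]) -> List[List[Pixel]]:
    """Fuse two spatially registered monochrome images of identical shape."""
    same_shape = len(i2_img) == len(ir_img) and all(
        len(a) == len(b) for a, b in zip(i2_img, ir_img)
    )
    if not same_shape:
        raise ValueError("images must be spatially registered (same shape)")
    return [
        [fuse_pixel(a, b) for a, b in zip(row_i2, row_ir)]
        for row_i2, row_ir in zip(i2_img, ir_img)
    ]


if __name__ == "__main__":
    # Hypothetical 1x2 scene: a reflective cool object and a dark warm object.
    print(fuse_images([[200, 10]], [[50, 220]]))
```

Even in this toy mapping, two objects with similar luminance in one band end up with different hues, which is the additional color contrast the paragraph above describes.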

If the final result of a fusion process is presented in a monochromatic format, the 
full capability of the HVS is not being used. Objects viewed by low-light and 
infrared sensors will generally have the same spatial characteristics, but they will have 
completely different contrast levels. By displaying these variations as color differences, as 
a result of sensor fusion techniques, target-background contrast should be improved and 
the dynamic range of the scene should be increased (Krebs et al., 1998). 

There is a very important distinction between a color scene observed by the 
unaided eye through the HVS and a processed color fused image. The color mapping 
process is affected by the specific information provided by the two sensors. Since neither 
sensor is in the visible waveband, the color algorithms used may not produce imagery that 
matches the colors seen by the HVS (Steele and Perconti, 1997). 

It is difficult to predict how beneficial multisensor color images will 
be for human performance. Some experimental evidence indicates that object recognition 
depends on stored knowledge of objects’ chromatic characteristics (Joseph & 
Proffitt, 1996); therefore, the incongruent colors of false color images may disrupt 
perceptual performance, and could even produce worse performance than 
single-band imagery alone (Sinai et al., 1999a), although the overall evidence is equivocal 
as to what role color plays in object recognition. 

Another major limitation of sensor fusion systems is that HVS mechanisms cannot 
be duplicated to achieve color constancy, as stated previously in the HVS section. 
Therefore, varying illumination conditions will produce different chromatic 
representations of the same object. As stated above, past psychophysical research 
has been inconclusive in determining the role of color in the human visual system and, 
therefore, in determining what utility, if any, color sensor fusion has for human 
performance. A review of the literature might clarify the current state of this 
field, which has intrigued vision researchers for many years. 



E. REVIEW OF THE LITERATURE 

It is likely that color vision evolved in response to the changes of human behavior, 
during the process of adaptation to the natural environment. In Polyak’s view, color 
vision evolved to facilitate food gathering, involving search and recognition of natural 
objects (Polyak, 1957). Color might facilitate these tasks by means of scene segmentation. 
Walls suggested that color promotes the perception of contour (Walls, 1942). Color 
differences, like luminance differences, can be used to segment images into regions 
containing information about individual objects and provide more reliable information 
about object shape, because shadows also produce luminance contours (De Valois & 
Switkes, 1983). 

Color may also serve another perceptual function apart from scene segmentation: 
object recognition. Although virtually no object can be recognized on the basis of its 
color alone, the colors of some objects are less arbitrary than those of others; objects 
with higher color diagnosticity could therefore be recognized more quickly (Biederman & Ju, 
1988). 

Object recognition can be achieved by means of two different processes, operating 
either separately or in combination. During a bottom-up process, color is used to define the 
contour or shape of the objects present in a scene. Once the shape of the object is defined, 
human memory recognizes the object based on that contour. This is normally the case of 
objects with low color diagnosticity. On the other hand, during a top-down process, color 
is used to access memory in a direct way, and allows subjects to distinguish an object 
among others of similar shape, just based on its color. This is the case of objects with high 
color diagnosticity. 

Assuming that scene segmentation is a bottom-up process, it should not depend on 
our knowledge of the colors of things. Diagnosticity, on the other hand, relies on 
memory. It might improve object recognition by restricting the set of possible alternatives 
(Wurm et al., 1993). 

There is disagreement, however, as to whether or not color is actually used to 
facilitate object recognition (Wurm et al., 1993). This disagreement can be attributed to 
several causes including differences in psychophysical tasks employed, differences in 
luminance characteristics across the color conditions, the use of different levels of shape 
degradation, and differences in types of objects used as stimuli and color formats 
employed. 

Three tasks have typically been used to determine the role of color in object 
recognition. These tasks are classification, verification and naming. In a classification 
task, participants are shown pictures or words that refer to a specific predesignated 
category (Price & Humphreys, 1989). In a verification task, a target name is presented to 
the participants and they must answer whether a subsequently presented stimulus matches 
the target or not (Ostergaard & Davidoff, 1985; Biederman & Ju, 1988; Joseph & Proffitt, 
1996). During naming tasks, participants must verbally identify the object shown in each 
stimulus (Ostergaard & Davidoff, 1985; Biederman & Ju, 1988; Price & Humphreys, 
1989; Wurm et al., 1993). 

Discrepancies in the results of previous research studies were analyzed by their 
own authors. Price & Humphreys (1989) stated that surface information may affect object 
naming more strongly than classification, and that the effects of naming may be most 
pronounced on objects that require most differentiation, because extra time is then 
required to differentiate any given object from its structurally related competitors. Joseph 
& Proffitt (1996) argued that their results differed from those of other studies for many of 
the same reasons that Price & Humphreys’s results differed. Because the objects they 
used in their verification tasks generally came from structurally similar categories (animals, 
fruits, vegetables, and flowers) that were also natural categories, it was not surprising to 
find an effect of surface color in their verification task. 

Other factors that may give rise to discrepant results in color research are 
considered in more detail in the remainder of this literature review. 

Markoff (1972) measured reaction times (RTs) for subjects to decide which of 
three targets (tank, jeep or soldier) was present in a black-and-white or color slide. The 
targets were hidden in real-world backgrounds. He blurred the slides to evaluate the 
interaction of spatial resolution and color. He found that RTs were shorter (and error 
rates lower) for the color slides, and the advantage of color over black-and-white 
performance increased with greater blur. These results indicate that color is helpful in a 
search task and that color may be more helpful when shape information is degraded. 

Ostergaard and Davidoff (1985) investigated the role of color in the recognition 
and naming of everyday objects. Their study was based on the idea that color is unhelpful 
in shape processing, given the considerable anatomical and physiological evidence for the 
separate processing of color and shape. Observations in monkeys 
indicate that the primate visual system consists of several separate and independent 
subdivisions that analyze different aspects of the same retinal image, such as color, depth, 
movement, and orientation. Human perceptual experiments are remarkably consistent 
with these predictions (Livingstone & Hubel, 1988). Therefore, they tried to find out at 
which stage of the object recognition process color and shape information are combined, 
given that at the end of this process we are aware of whether objects are incongruently 
colored (Perlmutter, 1980). This means that color is not a part of the pictorial coding of 
objects, i.e., it is not needed for the psychological description of the object, but 
rather it is stored as part of a set of attributes such as depth, movement or orientation 
(Seymour, 1979). They hypothesized that any benefit from having color vision should be 
obtained from a later stage than identification (Ostergaard and Davidoff, 1985). 

The first experiment considered the role of color in object naming, because if color 
affects the processing of objects at any stage from early registration to the availability of 
the object name, this should be reflected in the object naming latencies. Twenty-four 
common fruits and vegetables were photographed on black-and-white and color film. 
Each participant was shown the complete series of pictures and was required to name the 
objects depicted in those pictures. Colored pictures produced significantly faster response 
latencies than black-and-white pictures. Therefore, the authors concluded that color 
information is beneficial in object naming, although it was not clear whether those results 
occurred because color simply provides a separate cue for discriminating stimulus items, 
as part of a bottom-up process. 

In the second experiment they tried to solve this problem by only using items of 
identical color and thereby removing the possibility of using color to discriminate between 
alternatives. They compared object naming directly to object verification with three types 
of stimuli: items depicted in their natural color (always red), achromatic versions, and 
items depicted in an inappropriate color (always blue). This final condition was included 
to determine whether an inappropriate color would interfere or merely have the same 
effect as no color. In the naming task, participants were required to name the depicted 
objects as quickly as possible without making errors. In the verification task participants 
were required to respond positively when the target item was presented. Before each 
block of trials, the participants were informed what color the stimuli would be and they 
were shown the three alternative items. Color effect reached significance in the naming 
task, but it failed to reach significance in the object verification task. The naming 
advantage found for red pictures of objects could not be attributed to the stimulus 
characteristics of the colors used, so they concluded that it should be due to the 
meaningful conjunction of color and shape. They also concluded that there was no 
detrimental effect of wrong color compared to achromatic input (Ostergaard and 
Davidoff, 1985). 

The third experiment was run to verify the generality of the results of the previous 
one. In Experiment 2, the stimuli were blocked according to color type. In this 
experiment, correctly colored, inappropriately colored and achromatic stimuli were 
presented at random. Again, the participants accomplished naming and verification tasks. 
Positive responses during the verification task were significantly faster than negative 
responses. All other factors and interactions failed to reach significance. Item color was 
found to be significant in the naming task. Paired comparisons revealed that correctly 
colored pictures were named significantly faster than either achromatic or inappropriately 
colored versions. There was no significant difference between achromatic and 
inappropriate versions. Thus, the major result of Experiment 2 was confirmed. 

These results show that color facilitates object naming but not object recognition. 
They found faster object naming for color than for black-and-white pictures, and they 
believed this could be explained by assuming that objects are listed in semantic 
representation as a collection of physical attributes. One of these attributes is color, and 
they postulated that this attribute could be accessed directly by the physical color input. 
Another important conclusion of this study was that although correct color facilitated object 
naming, inappropriate color did not cause significant inhibition in either Experiment 2 or 3. 

Biederman and Ju (1988) investigated the role of color in object recognition, 
comparing the latency at which objects could be named or verified when they were shown 
either as line drawings or as color photographs. The empirical issue of this study was to 
determine whether the presence of surface attributes of an object, such as color, facilitates 
the psychological representation of that object, over what can be simply derived from a 
depiction of the object’s edges. Color diagnosticity among objects was also investigated in 
order to determine whether color and brightness were providing a contribution to recognition 
independent of the main effect of photographs versus drawings. For some kinds of objects, 
color is diagnostic of the object’s identity. For other kinds, normally man-made objects, 
color is not diagnostic. If color contributed to object recognition, then 
the former kinds of objects should benefit more than the non-diagnostic objects from 
their depiction as color photographs rather than as line drawings. 

It was expected that participants’ reaction times and error rates for the naming 
task would be smaller for color stimuli. In the verification task, each presentation was 
shown in one of three different exposure times followed by a mask. The exposure times 
were 50, 65 and 100 milliseconds in duration. In this task participants could anticipate the 
surface characteristics of almost all of the targets, and for the diagnostic objects, the color 
as well. Therefore, if participants were using the color to access an object’s mental 
representation, then objects photographed in color or those that were diagnostic should be 
recognized faster relative to the naming tasks. It was also assumed that longer exposure 
duration and slightly dimmer projector intensity would favor colored photography 
(Biederman and Ju, 1988). 

The results of the experiments did not match the authors’ assumptions. Over the 
five experiments, mean RTs and error rates for naming or verifying line drawings were 
virtually identical to those for color slides. Even objects with diagnostic colors did not 
enjoy any advantage when presented as color slides during the verification task. They 
found a mean advantage for the line drawings and also for the objects 
with no diagnostic color. The conclusion of these studies was that a simple line drawing 
could be identified about as quickly and as accurately as a colored photographic image of 
that same object. Color diagnosticity did not facilitate object recognition. These results 
support the premise that the access to a mental representation of an object can be 
accomplished with an edge-based representation of a few simple components. Color plays 
only a secondary role in recognition when edges can be readily extracted (Biederman and 
Ju, 1988). 

In contrast to Biederman’s results, some authors speculated that surface 
characteristics should facilitate recognition. Following this line, Price and Humphreys 
(1989) examined the effects of color congruency and photographic detail on the naming 
and classification of objects from structurally similar and structurally dissimilar categories. 
Color congruency was also assessed by contrasting performance with correctly colored 
line drawings of objects, black-and-white outline drawings and line drawings assigned very 
incongruent colors. 

To clarify all these effects, the authors conducted a series of experiments in which 
participants performed naming, subordinate (only with stimuli from structurally similar 
categories) and superordinate classification tasks. Color, when present, was part of the 
surface description of objects in Experiments 1 and 2. The influence of color in 
participants’ performance as part of the object’s surface description was examined in 
Experiment 3 by testing the effects of colored backgrounds on object naming (Price and 
Humphreys, 1989). 

Price and Humphreys hypothesized that object color and surface details would be 
beneficial for discriminating between categorical members, because these objects require 
greater differentiation to separate the target object from competitors of the same category 
during a naming task. Their findings supported this hypothesis and revealed that the 
influence of surface color in object recognition is not only reserved for naming tasks. 
Classification of objects can also benefit from the use of congruent surface color when 
shape information is not sufficient for discriminating among category members. The 
implication of Price and Humphreys’s findings is that effects of surface color are not 
necessarily reserved for the later, name-retrieval stages of processing (Joseph and Proffitt, 
1996). 

Joseph and Proffitt (1996) conducted a series of experiments to determine the 
influence of color as a surface feature, i.e., its role during a bottom-up process, versus its 
influence accessing stored knowledge during a top-down process, in order to achieve 
object recognition. They defined stored color knowledge as semantic information about 
the prototypical colors of objects, such as the knowledge that apples are typically red. 

They considered that the role of stored knowledge of color in object recognition 
had not been examined deeply enough in previous studies. They also considered that the 
findings in the literature, concerning the role of color in object recognition, have yielded 
mixed results. They argued that for surface color to be a useful cue for recognition, the 
participant must decide if the surface color is appropriate for an object. Therefore, they 
would have to access an object’s semantic description for this check process to occur and 
compare it with the surface color present in the image. This study investigated whether 
the decision that the stimulus matches the target or not, depends more on the activation of 
stored knowledge or on the processing of surface color. 

Joseph and Proffitt conducted three experiments to investigate the roles of surface 
color and stored color knowledge in object recognition. Pictures of natural objects were 
used as stimuli because most natural objects have prototypical colors, as opposed to 
man-made objects, in which color is considerably more arbitrary. In this way they could 
measure the influence of color when participants were presented with natural objects showing 
completely different colors from those stored in their memories. According to the results 
of the first experiment, which used a verification task, congruent surface color made 
recognition easier than did incongruent color. Congruently or incongruently colored line 
drawings of common natural objects were presented briefly, masked, then followed by a 
label. An object is considered congruently colored when it shows its most 
prototypical color, that is, the one that the vast majority of people have stored in memory 
as related to that specific object. These results appeared to conflict with Biederman and 
Ju’s (1988) findings, but the authors argued that the reason for this discrepancy might be 
the different natures of the stimulus sets. Thus, use of surface color as a cue for 
recognition is more beneficial for objects from natural categories (Joseph and Proffitt, 
1996). 

Results from the second and third experiments confirmed their hypothesis. They 
concluded that the processing of shape information was more influential than any source 
of color information in object recognition, but when response interference could not be 
attributed to shape information, i.e., when both stimulus and target had similar shape, 
stored color knowledge was an overriding factor relative to surface color. They also 
found that the activation of stored color knowledge did not depend on the presence of 
surface color, because even the identification of uncolored pictures was affected by stored 
color knowledge. They examined the effect of stored color knowledge by observing RTs 
and error interference when semantic associations of color were present or absent. For 
example, an uncolored picture of an apple might have been followed by the label cherry or 
by the label blueberry. More interference should occur with the label cherry because 
apples and cherries share a prototypical color. Apples and blueberries are different in 
prototypical color and, therefore, interference should be less (Joseph and Proffitt, 1996). 

Wurm, Legge, Isenberg, and Luebker (1993) investigated the role of color in 
object recognition, trying to find out whether color facilitates recognition in images with 
low spatial resolution. Previous studies (Markoff, 1972; Ostergaard and 
Davidoff, 1985; Biederman and Ju, 1988) have concluded that color improves object 
recognition more when spatial resolution is low (blur or noise) or when shape information 
is less specific (fruits and vegetables vs. man-made objects). The major purpose of this 
research was to examine the hypothesis that color and shape information interact in object 
recognition, that is, that color facilitates object recognition more when spatial resolution is low. 

In their two main experiments, participants were presented full-color and gray- 
scale images of twenty-one different food items and vegetables. They chose to use food 
objects because they have a wide range of colors and shapes, so they were representative 
of natural objects. These objects may provide a favorable domain for revealing a role of 
color, given that color vision probably evolved in response to functional interaction with 
natural objects (Polyak, 1957). 

The authors considered luminance as a factor that could have led to disagreement 
among the results of earlier studies examining the role of color in object recognition. 
Luminance characteristics varied across the color conditions in several previous studies. 
In the Markoff (1972) and Ostergaard and Davidoff (1985) studies, the distributions of 
luminance were not matched in the color and black-and-white slides. Biederman and Ju 
(1988) compared line drawings with color photographs. Visual analysis of the color 
photographs may have been at a disadvantage because edge extraction is more difficult 
in photographs than in line drawings (Wurm et al., 1993). To avoid this problem, 
Wurm and colleagues employed only gray-scale images matched pixel by pixel in 
luminance with the color images. 

Wurm, Legge, Isenberg and Luebker (1993) were also interested in examining whether 
color and shape information interact in object recognition, such that color facilitates 
recognition more when spatial resolution is low. Psychophysical and computational 
studies show that chromatic contrast sensitivity is confined to a lower spatial frequency 
range than luminance contrast sensitivity (Kelly, 1983; Mullen, 1985; Derrico & 
Buchsbaum, 1991). These studies support the authors’ hypothesis of an interaction 
between color and blur, on the assumption that chromatic contrast (color differences) can 
facilitate recognition when high-frequency information is removed by a shape-degrading 
factor such as blur. 

In one experiment of Wurm and his colleagues’ study, both full-color and gray 
scale images were presented in two resolutions, blurred and unblurred, during a naming 
task. RTs were shortest in the full-color unblurred condition and longest in the gray scale 
blurred condition. They concluded that color does improve recognition of food objects 
whether measured as accuracy or RT, but they did not find the hypothesized interaction 
between color and spatial resolution. Two additional experiments were conducted to 
examine the origins of this effect. They investigated whether shape prototypicality or color 
diagnosticity facilitated object recognition. They found that participants were faster at 
recognizing images judged to be highly prototypical (where the object is shown from its 
most common point of view), but that less prototypical images benefited more from color, 
that is, showed a greater reduction in RT. These findings are consistent with Biederman and 
Ju’s (1988) view that primary access to object recognition uses a structural (geometrical) 
representation of objects and that this representation is in part generated by the presence of 
color. The results of Experiment 5 suggested that participants’ explicit knowledge about 
food color (diagnosticity) does not account for the advantage of color in real-time object 
recognition. 

The authors questioned how color and shape could act additively and 
non-interactively in object recognition. They argued that perhaps color contributes to an early 
stage of contour extraction and scene segmentation (De Valois and Switkes, 1983; Walls, 
1942). That role is likely to rely on low spatial frequencies and hence be relatively 
insensitive to blur. Thus, they concluded that although color does improve object 
recognition, the mechanism is probably sensory rather than cognitive in origin. 
Otherwise, it would be related to people’s knowledge of the colors of things, but this 
would not match the results of their color-diagnosticity experiment (Wurm et al., 
1993). 

Several of these experiments concluded that shape is the basic element in object 
recognition (Ostergaard and Davidoff, 1985; Biederman and Ju, 1988; Wurm et al., 1993; 
Joseph and Proffitt, 1996) and that color plays a secondary role, facilitating object naming 
as a final step in object identification (Ostergaard and Davidoff, 1985; Wurm et al., 1993). 
But when object shape is degraded, color may play a more important role, facilitating 
object recognition through scene segmentation (Biederman and Ju, 1988). 

The disagreement among researchers about the role of color in object recognition 
has raised a similar question about the use of color in multisensor fusion, in which 
artificially colored images are supposed to improve human performance in target 
detection and situational awareness. Part of the discrepancy among these latter studies 
may stem from the different fusion algorithms used or from the wide variety of 
psychophysical tasks employed to measure behavior (Sinai et al., 1999b), as can be seen 
in the experiments summarized below. 

Waxman, Gove, Seibert, et al. (1996) conducted an experiment to evaluate 
human perception during a visual search task. The detection of small, low-contrast 
targets embedded in natural night scenes was measured in terms of reaction time and 
accuracy. Visible, infrared, color fused, and two forms of fused gray scale images were 
shown to the participants, whose task was to determine whether the hidden target was on 
the right half or the left half of the screen. Although the report of this study does not 
present any statistics supporting the results, RTs during the detection task were fastest when 
color fused imagery was used, across various levels of target contrast. 



Toet, Ijspeert, Waxman and Aguilar (1997) investigated whether the increased amount of 
detail in fused images can yield improved observer performance in a task that 
requires situational awareness. Fused images were obtained from low visible and thermal 
signals, using two different fusion methodologies. The stimuli presented to the 
participants were in six different chromatic formats: fused color images generated by two 
fusion algorithms, gray level images representing the luminance component of the fused 
color images, and gray level images representing the signals of the low-visible and the 
infrared cameras. The task required the detection and localization of a person in the 
displayed scene, relative to some characteristic details that provided the spatial context. 

Visual and thermal contrasts were low, since stimuli were collected just before and 
after sunrise. Visual contrast was low due to low luminance of the sky. Thermal contrast 
was also low due to the similar temperature of the objects in the scene. The authors 
hypothesized that the fusion of images registered in these conditions would result in 
images that represented both the context (background) and the details with a large thermal 
contrast (like people) in a single composite image. The results showed that participants 
could indeed determine the location of a person in a scene with a significantly higher 
accuracy when they performed with fused images, compared to the other chromatic 
formats. The two color fusion algorithms yielded the best overall performance, producing 
error rates of 1.5% and 1.9%, while the corresponding gray scale fused images, 
respectively, produced error rates of 4.5% and 4.9%. The error rate for the thermal 
images was 8%, and for the visual images was 20%. The authors concluded that color 
contrast in fused imagery does help in target detection. 

The objective of the study conducted by Steele and Perconti (1997) was to 
determine whether color fusion processes were of benefit to helicopter pilots in the 
performance of night helicopter flights. Specifically, the authors investigated whether 
adding synthetic color to night vision multisensor (visible near infrared and long wave 
infrared) fused imagery aided pilots in interpreting spatial relationships and improved 
situational awareness. The study consisted of a part task simulation with three task 
groups: object recognition and identification, horizon perception, and geometric 
perspective tasks. Object recognition and identification tasks required the participant 
to determine whether a specific object was present, to locate a specific object 
and determine its position in the field of vision, or to provide detailed information about an 
object. Horizon perception tasks required the participant to determine 
whether or not the perceived horizon was level. Geometric perspective tasks required the 
participant to identify the shape or orientation of an object using monocular depth 
perception cues. 

Images were presented in five different chromatic formats: monochrome, FLIR 
monochrome, a gray scale fusion algorithm and two different color fusion algorithms. 
Results differed across the three general types of visual tasks, although in general 
fusion based formats resulted in better participant performance. The 
authors concluded that the benefits of adding synthetic color to fused imagery are 
dependent on the color algorithm used, the visual task performed, and scene content. In 
the object recognition task, both the FLIR and the gray scale fusion formats resulted in 
significantly faster RTs. In the horizon perception tasks no significant differences were 
found among response times and accuracy. In geometric perspective tasks the gray scale 
fusion algorithm produced significantly faster RTs than the FLIR alone. There were no 
significant differences among the other formats. The two color fusion algorithms 
examined in this study represent two very different approaches. Therefore, it is not 
surprising to find these two algorithms at opposite ends of some of the data plots of this 
study (Steele and Perconti, 1997). 

The purpose of the study conducted by Krebs et al. (1998) was to modify the 
existing F/A-18 targeting FLIR system by adding a dual-band color sensor to improve 
target contrast and standoff ranges. The authors argued that objects viewed by low-light 
and infrared sensors would have dramatically different contrast levels between each 
system. Therefore, displaying these variations as color differences should improve target- 
background contrast and increase the dynamic range of the scene. When searching for a 
target, color should help by giving better context to the scene, as a result of the higher 
contrast among objects in the scene (scene segmentation), thus allowing for more efficient 
pilot orientation and target detection. 

This experiment used eight nighttime video sequences collected from an early 
prototype fusion sensor system developed by Texas Instruments and the Night Vision and 
Electronic Sensors Directorate (NVESD). Each of the sequences was presented in five 
different image formats: low light visible imagery, infrared imagery, gray scale fused 
imagery and two different color fused imageries. It was hypothesized that these images 
should be maximally optimized for target discrimination. A standard visual search task 
was used to assess whether pilots’ situational awareness was improved by using 
sensor-fused imagery. Participants responded faster to the infrared targets than to the 
targets in one of the color fused formats, while the other color fused format showed no 
significant difference. 
These results generally agree with Steele and Perconti’s (1997) study, which used the same 
videotaped sequences. The authors concluded that color fusion did not improve pilots’ 
situational awareness. Pilots reported that the color fused scene appeared unnatural due 
to the choice of colors. However, pilots did report that color fused objects were easier to 
discriminate than IR or I² objects, because the color contrast facilitates 
discrimination from the background noise. Therefore, color fusion may be more 
appropriate for targeting applications than for navigation and pilotage applications. 

The study conducted by Sinai, McCarley, Krebs and Essock (1999b) compared 
performance on two different tasks, an object recognition task and a situational awareness 
task. The authors hypothesized that performance on these two very different tasks would 
be differently affected both by the single sensor imagery and by the fused imagery. They 
hypothesized that performance on the detection/recognition task should be better for the 
IR imagery, because IR images usually have higher contrast than I² images. Likewise, 
they hypothesized that performance should be slightly better for the I² imagery compared 
to the IR imagery for the situational awareness task, because IR imagery has lower 
resolution than the I² imagery. The authors also argued that the fused imagery would 
result in performance at least as good as the better of the two single band sensors. 

Stimuli were images collected using a long-wave IR sensor and an I² low-light sensor. 
Six image formats were tested: single band IR and low-light formats, two color-fused 
formats and two achromatic fused formats, with each of the fused formats using IR 
imagery of white-hot polarity or black-hot polarity respectively. Two experiments were 
conducted. The first required participants to detect a target (person, vehicle or neither) 
against naturalistic backgrounds. The second measured participants’ situational awareness 
by asking them to decide whether the scene was inverted or not. 

Significant differences were found between the white-hot color fused error rates 
and the white-hot gray scale fused error rates for both tasks. Because the only 
difference between the two formats was the addition of color, the false color of the 
fusion algorithm improved performance for this format in both tasks. In the other fused 
format tested, however, color not only failed to improve performance but actually hindered 
it, increasing error rates in both tasks. The results of this study provided strong 
evidence for the benefits that color-fused imagery can produce in human performance, but 
also demonstrated how drastically results may vary according to the tasks or algorithms used 
in the research (Sinai et al., 1999b). 

In sum, several of these experiments concluded that color fusion facilitates target 
detection (Waxman et al., 1996) and situational awareness (Toet et al., 1997; Sinai et al., 
1999b); one concluded that only targeting applications, and not situational awareness, may 
benefit from a color fused scene (Krebs et al., 1998); and Steele and Perconti (1997) 
argued that the benefits of color fusion depend on the color algorithm used, the visual task 
performed and scene content. 

Summarizing the plausible benefits of colored imagery, there is some evidence 
that color may play an important role in object recognition when shape 
is degraded, both by means of scene segmentation in a bottom-up process and, if the 
coloring is congruent, by accessing stored knowledge in a top-down process. For 
these same reasons, color also seems likely to be useful for object recognition in fused 
imagery, where the contours of objects will often not be sharply defined. At a minimum, 
color contrast can facilitate contour definition in a bottom-up process, although color 
incongruency may produce disruptive effects, as the reviewed literature showed. There 
therefore seems to be enough support to move forward with research on the role of natural 
color and false color in object recognition. 



F. HYPOTHESIS 

In an effort to continue this line of research, avoiding some of the deficiencies 
detected in past studies and combining several of the techniques they used, this thesis 
conducts a human performance experiment that measures reaction 
times and error rates during a standard object naming task, in order to examine how natural 
and artificial color facilitate object recognition when objects’ shape information is 
degraded. Naming was chosen as the psychophysical task for this experiment for two 
reasons. First, it provides a way to check the accurate and positive identification of the 
object presented in each stimulus to the participant. Second, if color affects the 
processing of objects at any stage from early registration to the availability of the object 
name, this should be reflected in the object naming latencies (Ostergaard and Davidoff, 
1985). 

Digital photographs of natural objects (fruits and vegetables) were used. Common 
food objects provide a favorable domain for studying the interaction of color and 
shape due to their natural although limited variation of both attributes within a category 
(Wurm et al., 1993). The familiarity and non-arbitrary colors of these objects might 
encourage participants to use color for recognition, perhaps especially in those cases when 
shape is not so helpful due to its similarity among several stimuli. 

The effect of natural and artificial color was examined by comparing natural and 
false color imagery with their gray scale counterparts as a control for luminance. Since 
colored images and their gray scale equivalents were matched in luminance, any advantage 
measured in the colored imagery should have originated from the presence of color. 

Gaussian monochromatic noise was used as an image-degrading factor. The aim 
of using noise was to achieve some type of image degradation in order to examine how 
recognition might be affected under the degraded viewing conditions that occur with night 
vision devices. 

It was hypothesized that if stored color knowledge affects object recognition, 
shorter RTs and smaller error rates would occur within the natural color images across all 
levels of noise, and that the difference in RTs and error rates between natural color and 
false color images would be largest in the conditions with the greatest amount of noise. 
Faster RTs and smaller error rates were expected within the natural color images because 
participants would be able to use color information to access stored knowledge of the 
objects’ chromatic features. Larger effects of natural color images were also expected in 
the conditions with higher levels of noise because, with the objects’ shape 
information degraded, subjects might be encouraged to rely more heavily on color 
information to recognize the stimuli. The longest RTs and greatest error rates were 
expected within the gray scale images, because participants would be able neither to 
accomplish scene segmentation by color nor to access stored color knowledge during the 
object recognition task. Intermediate results were expected for false color images because 
participants, although unable to access stored knowledge of color, would at least 
be able to achieve scene segmentation and thus speed up recognition compared to gray 
scale images. The four stated hypotheses are summarized below: 

• Shorter RTs and smaller error rates were expected within natural color stimuli 
across all levels of noise. 

• Differences in RTs and error rates between natural color and false color stimuli 
were expected to be largest at the highest levels of noise. 

• Longest RTs and greatest error rates were expected within the grayscale images. 

• Intermediate results were expected for false color images. 

These will be the Alternative Hypotheses for the statistical tests. The Null 
Hypotheses will be that there are no differences. 



III. EXPERIMENT 



A. METHODS 

1. Participants 

Thirteen students (eleven male and two female) from various military services and 
job specialties, undergoing academic studies at the Naval Postgraduate School, and 
ranging in age from 28 to 38, voluntarily participated. Participants who volunteered for 
this study might not represent a broad spectrum of the population but their 
psychophysical characteristics were very similar to those of potential NVD operators. 

All participants were screened for normal color vision with the Pseudo-isochromatic 
Plates and had at least 20/20 corrected vision. Participants were naive to the purpose of 
the experiment. All participants were native English speakers. All participants granted 
informed consent prior to participation. 

2. Apparatus 

The experimental workstation consisted of a 200 MHz Pentium personal 
computer equipped with a Texas Instruments TMS-340 Video Board and the 
corresponding TIGA Interface to Vision Research Graphics software. The stimuli were 
presented on an MF-8521 High Resolution color monitor (21” X 20” viewable area) 
equipped with an anti-reflective, non-glare, P-22 short persistence CRT. Pixel size was 
.26” horizontal and .28” vertical. Resolution was 800 X 600 square pixels and the frame 
rate was 98.7 Hz. 
Luminance of the monitor was linearized by means of an eight-bit color look-up table 
(LUT) for the red, blue and green guns. Moderate ambient luminance was maintained 
during the test. Viewing distance was approximately 100 cm and the participants were 
free to move their heads. 

3. Stimuli 

Experimental stimuli were digital images of twenty-three fruits and vegetables 
whose names appear in Table 1. Images were photographed by the experimenter with a 
Kodak digital camera, Model DC50. Objects were photographed in the early afternoon 
under natural daylight against a background of white paper. Each object was 
photographed from four different viewpoints, all of them judged to be canonical by the 
experimenter. Viewing distance was varied in order to make all objects occupy 
approximately the same area within the photograph. From the initial set of twenty-three 
objects photographed, the nineteen objects judged by the experimenter to be the most 
common and easy to name were selected for use in the experimental trials. The four 
remaining objects were reserved for use in the practice trials. 

                      STIMULUS OBJECTS 

APPLE         AVOCADO(*)    BANANA        BEANS 
BROCCOLI      CABBAGE       CARROT        COCONUT 
CORN          CUCUMBER      GARLIC(*)     GRAPES 
LEMON         MUSHROOM      ONION         ORANGE(*) 
PEAS          PEAR          PEPPER        POTATO 
RADISH        TOMATO        ZUCCHINI(*) 

(*) Object used for practice trials only. 

Table 1: Object names. 

Images were then manipulated using commercially available image processing 
software (Adobe Photoshop 4.0). Images were first cropped to a rectangle of 600 X 500 
pixels, subtending 11.4° X 10.2° of visual angle from a viewing distance of 100 cm, 
and then rendered in each of four different color formats: natural hue chromatic, natural 
hue achromatic, false hue chromatic and false hue achromatic. False hue images were 
obtained by replacing the color of each pixel in a natural hue image with its 
complementary color. Complementary colors are a pair of colors which, when mixed 
additively, appear white. Along a color wheel, complements are any two colors 
separated by 180 degrees, that is, any two colors at opposite ends of a single diameter. For 
consistency, the hue of every natural hue image was shifted by +180 degrees in order to 
obtain its false color counterpart. The value of +180 degrees was chosen arbitrarily and is 
of no theoretical significance. Natural hue achromatic and false hue achromatic images 
were obtained by converting the chromatic natural hue and false hue images, respectively, 
to gray scale. Gray scale images 
matched their chromatic counterparts in pixel-by-pixel luminance. The purpose of these 
gray scale images was to provide a control for any changes in luminance that accompanied 
manipulations of color within chromatic images. 
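
The color manipulation described above can be illustrated with a short Python sketch. It is not the procedure applied by Adobe Photoshop, whose internal conversions are not described in this thesis. It assumes that the +180 degree shift is a rotation of hue in the HLS color space, and that luminance can be approximated by the Rec. 601 luma weights rather than by the measured luminance of the monitor. The function names are hypothetical.

```python
import colorsys

def rotate_hue(rgb, degrees=180.0):
    """Rotate the hue of an 8-bit (R, G, B) pixel, keeping HLS lightness and saturation."""
    r, g, b = (c / 255.0 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + degrees / 360.0) % 1.0          # hue is stored as a fraction of a full turn
    return tuple(round(255 * c) for c in colorsys.hls_to_rgb(h, l, s))

def to_gray(rgb):
    """Replace a pixel by a gray level equal to its approximate luminance.

    Assumption: Rec. 601 luma weights stand in for the monitor's true luminance.
    """
    r, g, b = rgb
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)

natural = (255, 0, 0)             # a saturated red pixel
false = rotate_hue(natural)       # its complement on the color wheel
natural_gray = to_gray(natural)   # achromatic counterpart of the natural hue pixel
false_gray = to_gray(false)       # achromatic counterpart of the false hue pixel
```

Under these assumptions the red pixel maps to cyan, and the two pixels have different gray levels. A hue rotation of this kind generally changes a pixel's luminance, which is why each chromatic format needs its own gray scale counterpart as a control.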

Once the four sets of stimuli were obtained, degraded images were produced by 
adding achromatic Gaussian noise to the undegraded images. Noise was added by 
increasing or decreasing the intensity of each pixel within an image by a value drawn 
pseudo-randomly from a Gaussian distribution with a mean of 0. The standard deviation 
(SD) of this distribution determined the level of degradation. Three levels of noise were 
used. At the first level images were not degraded (SD of noise distribution = 0). At the 
second and third levels, images were degraded with values drawn from Gaussian 
distributions with SDs of 50 and 100 units respectively. Color formats and degradation 
levels were crossed factorially. 

Mean luminance of each image in the experimental set of stimuli was calculated. 
The mean and standard deviation of the luminance values for each color format and 
level of noise were then computed. These results are shown in Table 2. As Table 2 shows, 
average luminance is almost constant across color formats at each level of noise, 
although values for individual images varied more widely. Pixel values of less than 
zero or greater than 255 were set to zero and 255, respectively, when noise was 
added. Therefore, mean pixel values decreased slightly as noise increased, tending toward 
a value of 128 (50 cd/m²). The mean luminance for each color format is almost 
constant too (natural hue chromatic = 58.15 cd/m², false hue chromatic = 57.92 cd/m², 
natural hue monochromatic = 58.08 cd/m², false hue monochromatic = 58.12 
cd/m²). Therefore, differences in RTs between color images and their gray scale 
counterparts cannot be attributed to differences in luminance. 
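
The noise procedure and the effect of clipping on mean pixel value can be sketched as follows. This is an illustration under stated assumptions, not the routine used to build the stimuli: it assumes that "achromatic" noise means one Gaussian offset applied equally to the red, green and blue values of a pixel, and the uniform test patches and function names are hypothetical.

```python
import random

def add_achromatic_noise(image, sd, rng):
    """Add one Gaussian offset per pixel to R, G and B, then clip to 0..255."""
    noisy = []
    for r, g, b in image:
        offset = rng.gauss(0.0, sd) if sd > 0 else 0.0
        noisy.append(tuple(min(255, max(0, round(c + offset))) for c in (r, g, b)))
    return noisy

def mean_level(image):
    """Mean pixel value over all channels of all pixels."""
    return sum(sum(pixel) for pixel in image) / (3 * len(image))

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
bright = [(200, 200, 200)] * 20000       # hypothetical uniform bright patch
dark = [(50, 50, 50)] * 20000            # hypothetical uniform dark patch

for sd in (0, 50, 100):                  # the three SDs used in the study
    print(sd,
          round(mean_level(add_achromatic_noise(bright, sd, rng)), 1),
          round(mean_level(add_achromatic_noise(dark, sd, rng)), 1))
```

Because values pushed outside the 0 to 255 range are clipped, the mean of the bright patch falls and the mean of the dark patch rises as the SD grows. Both drift toward mid-gray, which is the behavior described in the text for the stimulus set.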





                                  COLOR FORMAT 

NOISE      Natural Hue     False Hue       Natural Hue       False Hue 
LEVEL      Chromatic       Chromatic       Monochromatic     Monochromatic 

0          58.84 (9.31)    58.61 (8.63)    58.85 (9.20)      58.83 (8.43) 
1          58.69 (8.88)    58.45 (8.82)    58.60 (8.81)      58.63 (8.07) 
2          56.89 (7.08)    56.70 (6.56)    56.79 (7.13)      56.89 (6.35) 

Table 2: Mean values and standard deviations (in parentheses) of luminance for each 
color format and level of noise, in cd/m². 
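
As a quick arithmetic check, the per-format means quoted in the text can be compared with the averages of the Table 2 entries across the three noise levels. The sketch below only re-uses the published values. The small residual differences are expected, since the quoted means were presumably computed from unrounded data.

```python
# Mean luminance (cd/m^2) per color format at noise levels 0, 1 and 2, copied from Table 2.
table2_means = {
    "natural hue chromatic":     [58.84, 58.69, 56.89],
    "false hue chromatic":       [58.61, 58.45, 56.70],
    "natural hue monochromatic": [58.85, 58.60, 56.79],
    "false hue monochromatic":   [58.83, 58.63, 56.89],
}

# Per-format means quoted in the running text.
quoted = {
    "natural hue chromatic":     58.15,
    "false hue chromatic":       57.92,
    "natural hue monochromatic": 58.08,
    "false hue monochromatic":   58.12,
}

averages = {fmt: sum(vals) / len(vals) for fmt, vals in table2_means.items()}
spread = max(averages.values()) - min(averages.values())

for fmt, avg in averages.items():
    print(f"{fmt}: table average {avg:.2f}, quoted {quoted[fmt]:.2f}")
print(f"largest difference between formats: {spread:.2f} cd/m^2")
```

The table averages agree with the quoted means to within rounding, and the four formats differ from one another by less than 0.3 cd/m².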



4. Procedure 

Each of the nineteen objects was presented, in random order, in twelve different formats 
(four color formats X three noise levels) and 
from two different points of view selected at random from among the four available 
views of each object. A total of 456 (19 X 12 X 2) stimuli were presented to each 
participant as experimental trials, and fifteen stimuli were presented for practice 
purposes. 

Each subject was thoroughly briefed on the background and procedures of the 
experiment and was given the opportunity to ask questions. Prior to testing, participants 
read a list of the nineteen food categories in the experimental trials and the four categories 
in the practice trials. They were told that their task was to name as rapidly and accurately 
as possible the object presented in each stimulus. They were also told that the objects 
would fill the viewing window on the screen (i.e., there would be no scale information) 
and that items could appear alone, or in groups of items of the same type (beans, grapes, 
peas and radishes). 

Participants were tested one at a time. They were seated in front of the monitor at 
a distance of 100 cm. The experimenter was seated in front of a different monitor in the 
same room, from which he could not see the stimulus images, and remained unaware of the 
format in which each image was presented. A warning tone alerted the 
observer that the trial was about to start, followed by a pause of 500 msec before the 
image was presented on the screen. Each image remained on the screen until the observer 
responded. Upon hearing the observer’s response, the experimenter immediately pressed 
a key to stop the timer and one more key to record the accuracy of the response (1 for 
“true”, 2 for “false”), based on the correct response that appeared on the experimenter’s 
monitor. No feedback was provided following any response. The subsequent trial 
followed after an intertrial interval (ITI) of approximately 1,000 msec. A uniform gray 
patch of the same size as the food images was shown on the screen during ITIs. The 
experimenter allowed participants to relax for a short period of time after each group of 
40 images. Each participant completed two experimental sessions, with each session 
comprising a block of 228 trials. Within a block, participants observed each of the 
nineteen objects once in each of the twelve image formats (19 X 12). The point of view 
from which each object was viewed was chosen randomly. The entire experiment lasted 
approximately 50 minutes (25 minutes for each session). 
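
The trial structure described above can be sketched in Python as follows. The code is a hypothetical reconstruction for illustration, not the Vision Research Graphics script used in the experiment. The dictionary keys, the seed and the viewpoint numbering are assumptions, and the object names are the nineteen non-practice entries of Table 1.

```python
import random

OBJECTS = [
    "APPLE", "BANANA", "BEANS", "BROCCOLI", "CABBAGE", "CARROT", "COCONUT",
    "CORN", "CUCUMBER", "GRAPES", "LEMON", "MUSHROOM", "ONION", "PEAS",
    "PEAR", "PEPPER", "POTATO", "RADISH", "TOMATO",
]
COLOR_FORMATS = [
    "natural hue chromatic", "natural hue achromatic",
    "false hue chromatic", "false hue achromatic",
]
NOISE_SDS = [0, 50, 100]
VIEWPOINTS = [1, 2, 3, 4]          # four photographed views per object

def build_block(block, rng):
    """One block: every object once in each of the 12 formats, in random order."""
    trials = [
        {"block": block, "object": obj, "color_format": fmt,
         "noise_sd": sd, "viewpoint": rng.choice(VIEWPOINTS)}
        for obj in OBJECTS for fmt in COLOR_FORMATS for sd in NOISE_SDS
    ]
    rng.shuffle(trials)
    return trials

rng = random.Random(0)             # fixed seed so the sketch is repeatable
blocks = [build_block(b, rng) for b in (1, 2)]
all_trials = blocks[0] + blocks[1]
print(len(blocks[0]), len(all_trials))
```

Each block holds 19 X 12 = 228 trials and the two blocks together give the 456 experimental trials per participant.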

Each subject was given a block of 15 practice trials prior to the first session of the 
experiment. All participants completed both sessions on the same day. The practice trials 
were presented in the same format as the experimental stimuli. Upon completion, 
participants were offered a brief pause to ask any questions. The actual experiment then 
started. Reaction time and accuracy were recorded for each trial. 

5. Experimental Design 

The experimental design for this study was a 2 X 2 X 3 X 2 within-subjects 
factorial design. Factors included color (chromatic or achromatic), hue (natural or false), 
level of noise (0, 50, 100), and block (Session 1 or 2). Sex was not considered as a factor 
for this experiment. Each of the 12 cells contained 38 observations per participant (19 
objects X 2 replications) for a total of 456 data points per participant. Reaction times for 
incorrect responses, and all data from the 15 practice trials, were excluded from data 
analysis. Prior calculations determined that the number of participants (13) was 
sufficient to achieve statistical power greater than 0.8 under all hypotheses. 



B. DEVELOPMENT OF THE STATISTICAL MODEL 



As with any psychophysical experiment, inconsistencies occur between various 
participants’ RTs and error rates under different experimental conditions. These 
differences arise in part due to variances in individual participants, image conditions and 
manipulation. To account for these differences, the following model is proposed: 

$$\mathrm{RT}_{ijklmn}\ (\text{or error rate}) = \mathrm{Part}_i + \mathrm{Color}_j + \mathrm{Hue}_k + \mathrm{Noise}_l + \mathrm{Block}_m + \mathrm{error}_n$$

where $\mathrm{Part}_i$ represents the $i$th participant of the experiment; 

$\mathrm{Color}_j$ represents the color of the image (chromatic or achromatic); 

$\mathrm{Hue}_k$ represents the hue condition of the image (natural or false); 

$\mathrm{Noise}_l$ represents the level of achromatic Gaussian noise (0, 50, 100); 

$\mathrm{Block}_m$ represents each of the two sessions of the experiment (S1 or S2); 

$\mathrm{error}_n$ represents unaccounted variations within the model. 
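
To make the additive structure of this model concrete, the sketch below estimates the Color, Hue and Noise terms as deviations of marginal means from the grand mean, using the twelve mean RTs reported in Table 3. It is only an illustration and not the statistical analysis performed in this thesis. It assumes that unweighted averages of cell means are adequate, and it cannot estimate the participant and block terms, because Table 3 is already averaged over both.

```python
from statistics import mean

# Mean RTs (msec) from Table 3, keyed by (color, hue, noise level 0/1/2).
cell_means = {
    ("chromatic", "natural", 0): 910.10,
    ("chromatic", "natural", 1): 940.12,
    ("chromatic", "natural", 2): 1015.23,
    ("achromatic", "natural", 0): 914.58,
    ("achromatic", "natural", 1): 971.01,
    ("achromatic", "natural", 2): 1091.01,
    ("chromatic", "false", 0): 968.15,
    ("chromatic", "false", 1): 975.93,
    ("chromatic", "false", 2): 1064.08,
    ("achromatic", "false", 0): 951.09,
    ("achromatic", "false", 1): 1005.88,
    ("achromatic", "false", 2): 1146.12,
}

def main_effects(cells, position):
    """Deviation of each factor level's marginal mean from the grand mean."""
    grand = mean(cells.values())
    levels = sorted({key[position] for key in cells}, key=str)
    return {
        level: mean(v for k, v in cells.items() if k[position] == level) - grand
        for level in levels
    }

grand_mean = mean(cell_means.values())
color_effect = main_effects(cell_means, 0)
hue_effect = main_effects(cell_means, 1)
noise_effect = main_effects(cell_means, 2)

print(round(grand_mean, 2))
for name, effect in (("Color", color_effect), ("Hue", hue_effect), ("Noise", noise_effect)):
    print(name, {level: round(value, 2) for level, value in effect.items()})
```

Within each factor the estimated deviations sum to zero, so a predicted cell mean under the purely additive model is the grand mean plus one deviation per factor.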



C. RESULTS AND DISCUSSION 

Mean RTs and error rates for all twelve combinations of color conditions and 
levels of noise, averaged across participants, are shown in Table 3 and Table 4. These 
same results are also represented graphically in Fig. 11 and Fig. 12. 

Figure 1 1 illustrates how RTs increased for all four color formats when noise 
increased. Natural color (NC) RTs were the shortest at each level of noise compared to 

                              COLOR CONDITION
NOISE           Natural hue   Natural hue   False hue    False hue
LEVEL           chromatic     achromatic    chromatic    achromatic

0     Mean        910.10        914.58        968.15       951.09
      SE           11.09          9.76         12.92        15.11

1     Mean        940.12        971.01        975.93      1005.88
      SE           13.71         12.95         13.86        16.54

2     Mean       1015.23       1091.01       1064.08      1146.12
      SE           17.26         17.21         18.31        23.73

Table 3: Mean reaction times and standard errors (msec) for each level of noise and color condition. 





                              COLOR CONDITION
NOISE           Natural hue   Natural hue   False hue    False hue
LEVEL           chromatic     achromatic    chromatic    achromatic

0     Mean          0.40          1.21          0.61         2.23
      SE            0.28          0.44          0.34         0.66

1     Mean          1.62          3.04          3.24         4.05
      SE            0.57          0.78          0.72         0.84

2     Mean          3.04          8.12          8.30        10.53
      SE            0.88          1.37          1.28         1.49

Table 4: Mean error rates and standard errors (%) for each level of noise and color condition. 

Figure 11: Mean reaction times (msec) for each level of noise and color condition, with one SEM error bars. 

Figure 12: Mean error rates (%) for each level of noise and color condition, with one SEM error bars. 

the other color formats, as was predicted by the first hypothesis. The differences in RTs 
between NC and FC conditions, however, did not increase as the level of noise increased, 
contrary to the second hypothesis. These differences 
were almost constant across all levels of noise, although RTs for NC images were always 
shorter than FC RTs at the same level of noise. 

At level of noise 0, RTs were shortest for natural hue formats (NC and NG) and 
longest for artificial hue conditions (FC and FG). As the level of noise increased, RTs for 
the achromatic conditions (NG and FG) became slower relative to the chromatic 
conditions (NC and FC). The difference in RTs between chromatic and achromatic stimuli 
increased as noise increased, with the achromatic conditions reaching the longest RTs at 
the highest level of noise, as stated in the third hypothesis. Also, although the FC condition 
had the longest RT at level of noise 0, its relative performance improved as noise increased, 
and it showed the second best result at level of noise 2, as stated in the fourth hypothesis. 

RT and error rate results were very similar. Figure 12 
illustrates how NC error rates were smaller at each level of noise than those of the other 
color formats, as stated in the first hypothesis. At level of noise 0, error rates were 
very similar for NC and NG formats. FC error rates were almost the same as NC error 
rates at that level, although for levels of noise 1 and 2, FC error rates were more similar to 
those of the achromatic formats (NG and FG), and their differences from NC stimuli 
increased as the level of noise increased, as was predicted by the second hypothesis. 

Both figures show larger RTs and error rates for the achromatic conditions and 
intermediate results for the FC conditions at each level of noise, as was predicted by the 
third and fourth hypotheses. 

Using the same results, this study measured the advantage of using natural color 
versus false color at different levels of noise. 
Because changes of hue also entailed changes of luminance within stimulus images, a 
direct comparison of RTs to NC and FC images cannot indicate effects of natural 
chromatic information. In order to assess the benefit of the use of color, therefore, Tables 
5 and 6 express the differences in RTs and ERs between achromatic and chromatic 
natural conditions (NG-NC) and between achromatic and chromatic artificial conditions 
(FG-FC) at each level of noise. These values will be referred to as natural color benefit 
and false color benefit. If the use of color conferred any benefit, chromatic 
conditions should have obtained better results than their gray scale counterparts, and this 
advantage should have increased with increasing levels of noise. Were the benefits of 
color rendering the exclusive result of facilitated image segmentation, furthermore, then 
effects of natural and false color should have been similar across all levels of noise. 
Conversely, were natural color useful in accessing the stored information necessary for 
naming stimulus items, natural color benefit should have exceeded the benefit of false 
color, particularly at high levels of image noise, where information about the stimulus 
items’ shapes was most severely degraded. These differences are also shown in Fig. 13 
and Fig. 14. 
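
As a worked check of these definitions, the following sketch recomputes the natural color benefit (NG-NC) and false color benefit (FG-FC) from the cell means reported in Table 3 and Table 4. The function name `color_benefit` and the data layout are illustrative choices, not part of the original analysis.

```python
# Cell means copied from Table 3 (RT, msec) and Table 4 (error rate, %).
# Keys are noise levels 0, 1, 2; tuple order is (NC, NG, FC, FG).
MEAN_RT = {
    0: (910.10, 914.58, 968.15, 951.09),
    1: (940.12, 971.01, 975.93, 1005.88),
    2: (1015.23, 1091.01, 1064.08, 1146.12),
}
MEAN_ER = {
    0: (0.40, 1.21, 0.61, 2.23),
    1: (1.62, 3.04, 3.24, 4.05),
    2: (3.04, 8.12, 8.30, 10.53),
}


def color_benefit(cell_means):
    """Return {noise level: (natural benefit, false benefit)}.

    Natural benefit is NG - NC and false benefit is FG - FC, so a positive
    value means the chromatic version outperformed its gray scale counterpart.
    """
    benefits = {}
    for level, (nc, ng, fc, fg) in cell_means.items():
        benefits[level] = (round(ng - nc, 2), round(fg - fc, 2))
    return benefits


rt_benefit = color_benefit(MEAN_RT)
er_benefit = color_benefit(MEAN_ER)
for level in sorted(rt_benefit):
    print(level, rt_benefit[level], er_benefit[level])
```

The printed differences can be compared directly with the entries of Tables 5 and 6; a negative entry indicates that the chromatic condition was slower (or less accurate) than its achromatic counterpart.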

Figure 13 shows almost no advantage at level of noise 0 for natural conditions and 
a small, although non-significant, disadvantage for artificial conditions. As noise 
increases, color benefits increase too, with the largest advantage at the highest level of 
noise and very similar advantages for natural and artificial conditions. The use 
of either natural or artificial color seems to be similarly helpful regarding RTs in a 
recognition task, although NC RTs are always faster than their FC counterparts. 





             Natural     Artificial

Level 0        4.48        -17.06
Level 1       30.89         29.95
Level 2       75.78         82.04

Table 5: Mean reaction time differences (msec) for each level of noise and hue. 





             Natural     Artificial

Level 0        0.81          1.62
Level 1        1.42          0.81
Level 2        5.08          2.23

Table 6: Mean error rate differences (%) for each level of noise and hue. 

Figure 13: Mean reaction times color advantage (msec) for each level of noise and hue. 



Figure 14 illustrates how, for natural conditions, the color benefit for error rates 
increased with the level of noise, whereas artificial conditions did not show a similar increase 
in benefit as the level of noise increased, although these differences were 
non-significant. Nevertheless, there was always some benefit in accuracy at each 
level of noise and for each color condition. 

Figure 14: Mean error rates color advantage (%) for each level of noise and hue. 

To determine the appropriate statistical method for the analysis of the 
results of this experiment, normality was checked using histograms and normal 
Q-Q plots of residuals as diagnostic plots. These checks showed that the data were not 
normally distributed. In order to satisfy the assumption of normality, power 
transformations were applied to the RT and error rate data. RTs on trials on which 
participants responded incorrectly were treated as missing values. Repeated measures 
analyses of variance (ANOVAs) were performed on the transformed data, both for RTs 
and for error rates, with a significance level of 0.01. Although the analyses were 
performed on 1/RT and the square root of ER, the terms RT and error rates are used for 
convenience throughout this study. Mean RTs in milliseconds (msec) and mean error 
rates in percent for each of the participants were calculated from individual performances 
for each condition. When RT and error rate means are reported in the text, these are 
untransformed data. 
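
A minimal sketch of the two transformations named above (reciprocal for RT, square root for error rate) is shown here. The function names and the input values are hypothetical and serve only to illustrate the scale of the transformed data.

```python
import math


def transform_rt(rt_msec):
    """Reciprocal transform (1/RT) of a reaction time in milliseconds."""
    if rt_msec <= 0:
        raise ValueError("reaction time must be positive")
    return 1.0 / rt_msec


def transform_er(error_rate_pct):
    """Square-root transform of an error rate in percent."""
    if error_rate_pct < 0:
        raise ValueError("error rate cannot be negative")
    return math.sqrt(error_rate_pct)


# Illustrative inputs only (not participant data).
print(transform_rt(1000.0))
print(transform_er(4.0))
```

Note that the reciprocal transform reverses the ordering of RTs: a faster response yields a larger transformed value. It also explains why differences between transformed RT means are very small numbers.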

The analysis, with participants as a random variable, was a 2 X 2 X 3 X 2 repeated 
measures ANOVA, with the independent variables Hue (natural, false), Color (chromatic, 
achromatic), Noise (0, 50, 100), and Block (1, 2). The first three independent variables 
were repeated within subjects. The same ANOVA that was used for RTs was also used 
for the error rate analysis. 

The a priori hypotheses and some interesting interactions can be explored using 
univariate analysis on RT and error rates separately. ANOVA on the dependent variable 
RT showed significant main effects for Color, F(1, 5674) = 20.7366, p < 0.01; Hue, 
F(1, 5674) = 36.0931, p < 0.01; and Noise, F(2, 5674) = 141.0512, p < 0.01. Participants 
responded significantly faster when images were chromatic rather than achromatic, when 
hue was natural rather than artificial, and when images were less degraded by Gaussian 
noise. Mean RT for chromatic images was 978.94 msec (Std. Dev. 122.4 msec) and for 
achromatic images was 1013.28 msec (Std. Dev. 131.1 msec). Mean RT for natural 
images was 973.68 msec (Std. Dev. 120.6 msec) and for artificial images was 1018.54 msec 
(Std. Dev. 131.1 msec). Mean RTs for the three different levels of noise were: Level 0, 
935.98 msec (Std. Dev. 105.4 msec); Level 1, 973.24 msec (Std. Dev. 106.4 msec); and 
Level 2, 1079.11 msec (Std. Dev. 124.8 msec). The ANOVA results and the above values 
support a significant difference in mean RTs across color, hue and noise conditions. 
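
The marginal means quoted above can be related to the cell means of Table 3 with the following sketch. It takes an unweighted average of the relevant cell means, which is an assumption about how the marginal values were obtained; the names `CELL_RT` and `marginal_mean` are illustrative.

```python
from statistics import mean

# Cell mean RTs (msec) from Table 3, keyed by (hue, color, noise level).
CELL_RT = {
    ("natural", "chromatic", 0): 910.10,
    ("natural", "achromatic", 0): 914.58,
    ("false", "chromatic", 0): 968.15,
    ("false", "achromatic", 0): 951.09,
    ("natural", "chromatic", 1): 940.12,
    ("natural", "achromatic", 1): 971.01,
    ("false", "chromatic", 1): 975.93,
    ("false", "achromatic", 1): 1005.88,
    ("natural", "chromatic", 2): 1015.23,
    ("natural", "achromatic", 2): 1091.01,
    ("false", "chromatic", 2): 1064.08,
    ("false", "achromatic", 2): 1146.12,
}


def marginal_mean(factor_index, level):
    """Average the cell means that share one level of one factor.

    factor_index: 0 for hue, 1 for color, 2 for noise level.
    """
    return mean(v for k, v in CELL_RT.items() if k[factor_index] == level)


for index, level in [(1, "chromatic"), (1, "achromatic"),
                     (0, "natural"), (0, "false"), (2, 0), (2, 1), (2, 2)]:
    print(level, marginal_mean(index, level))
```

Under this assumption the averages agree with the reported marginal means to within rounding, which is a useful consistency check between the text and Table 3.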

Similar results were obtained for the dependent variable ER. ANOVA also showed 
significant main effects for Color, F(1, 285) = 17.5616, p < 0.01; Hue, F(1, 285) = 
16.9579, p < 0.01; and Noise, F(2, 285) = 54.6472, p < 0.01. Mean error rate for 
chromatic images was 2.87 (Std. Dev. 3.58) and for achromatic images was 4.86 (Std. 
Dev. 4.89). Mean error rate for natural images was 2.91 (Std. Dev. 3.76) and for artificial 
images was 4.82 (Std. Dev. 4.77). Mean error rates for the three different levels of noise 
were: Level 0, 1.11 (Std. Dev. 1.76); Level 1, 2.99 (Std. Dev. 2.71); and Level 2, 7.50 
(Std. Dev. 5.11). The ANOVA results and the above values support a significant 
difference in mean error rates across color, hue and noise conditions. Participants were not 
only faster but also more accurate when chromatic, natural and non-degraded images were 
shown to them. 

Once the significance of the main effects had been tested, the factorial interactions were 
analyzed. The next set of figures was constructed to assist in analyzing factorial 
interactions for both dependent variables. Figure 15 and Figure 16 illustrate the Hue by 
Color interaction. The lines joining mean RTs and error rates for the same Color level are 
roughly parallel across the two levels of Hue. Apparently, there is no interaction between 
these two factors at the 1% significance level. ANOVA yields these results: dependent 
variable RT, F(1, 5674) = 4.5517, p = 0.0329; dependent variable ER, F(1, 285) = 1.1320, p 
= 0.2882, confirming the assumption derived from inspection of the graphs. 

The non-significance of this interaction for both dependent variables shows that 
although participants were faster and more accurate with chromatic images than with 
their achromatic counterparts, these differences in RTs and error rates were similar for 
both natural and false hue. It seems that the advantage of NC images over NG images 
(derived from the presence of color) is similar to the advantage of FC over FG images. 
These results suggest that false color is not interfering with recognition; otherwise the 
difference between natural conditions (NG-NC) compared to the difference between 
artificial conditions (FG-FC) should have been significant. 

Figure 17 and Figure 18 illustrate the Noise by Color interaction. Figure 17 shows 
how, as the level of noise increases, the difference in RTs between chromatic and 
achromatic images increases too, with faster RTs for the chromatic images. Apparently there is 
an interaction between these two factors for the dependent variable RT. ANOVA yields these 
results: dependent variable RT, F(2, 5674) = 9.4622, p < 0.01; dependent variable ER, 
F(2, 285) = 1.4343, p = 0.2399; therefore only the interaction for RT is significant. 




Figure 15: Mean reaction times (msec) for hue and color conditions. 

Figure 16: Mean error rates (%) for hue and color conditions. 

Figure 17: Mean reaction times (msec) for noise and color conditions. 

Figure 18: Mean error rates (%) for noise and color conditions. 

The significance of this interaction for the dependent variable RT shows that participants 
were not only faster with chromatic images than with their achromatic counterparts but 
also that the difference between chromatic and achromatic stimuli increased as the level 
of noise increased. These results suggest that color plays a role in object recognition, 
speeding the identification of the stimuli when their shape is degraded, and that the more 
degraded the objects are, the more helpful color is. As noted above, the absence of an 
interaction between level of hue (natural or false) and level of color (chromatic or 
achromatic) indicates that this effect is the result of facilitated image segmentation. 

Figure 19 and Figure 20 illustrate the Noise by Hue interaction. Figure 19 shows 
how the lines that represent mean RTs for natural and false images across the three 
levels of noise remain parallel to each other. Apparently there is no interaction between 
these two factors for the variable RT. Figure 20 shows how these lines diverge slightly for 
increasing levels of noise, indicating a possible interaction between these two factors for 
the dependent variable ER. ANOVA yields these results: dependent variable RT, 
F(2, 5674) = 0.3262, p = 0.7217; dependent variable ER, F(2, 285) = 2.7447, p = 0.0659, 
indicating that the Hue by Noise interaction is not significant for either dependent variable. 
The non-significance of this interaction for both RT and 
error rates shows that although participants were faster and more accurate with natural 
hue images than with their artificial counterparts, the difference between these two 
formats did not increase with increasing levels of noise. These results suggest that false 
color does not interfere with recognition as the level of noise increases, in a similar way to 
what was shown for the Hue by Color interaction. 

Figure 19: Mean reaction times (msec) for noise and hue conditions. 

Figure 20: Mean error rates (%) for noise and hue conditions. 

The three-way factorial interaction was non-significant for both dependent measures 
according to the ANOVA results: dependent variable RT, F(2, 5674) = 0.1944, p = 
0.8233; dependent variable ER, F(2, 285) = 1.2618, p = 0.2847. Based on these results, 
the second a priori null hypothesis (no interaction of NC and FC conditions) cannot be 
rejected. The non-significance of this interaction shows that the difference between 
natural hue images and their gray scale counterparts (NG-NC) and between false hue 
images and their respective achromatic counterparts (FG-FC) does not change 
significantly when the level of noise changes. These results suggest that the 
beneficial effects of natural color and false color remain similar for increasing levels of 
noise. 

Analysis of the experimental data showed that RTs for color stimuli were faster than 
those for gray scale images and that this effect increased when the level of noise increased. These 
results suggest that both natural color and false color conditions might be beneficial in 
object recognition. The advantage of using natural color seems to be similar to the 
advantage that is obtained when artificial color is used. Therefore false color does not 
seem to be disruptive in recognition tasks. Both natural and false color stimuli are similarly 
helpful at different levels of noise, such that even for high degradation levels false color 
remains non-disruptive in object recognition. All these results also suggest that participants 
are not using color to recognize the objects in a top-down process. They are just carrying out 
a bottom-up process, using color for image segmentation, without any effect of the level of 
hue (natural or false) or of the level of noise. If they were using stored knowledge of 
color to recognize objects, the advantage of using natural color should be larger than the 
advantage obtained from the use of false color, and this is not the case. 

It should be recalled that FC images were obtained by means of hue manipulation 
of the original NC images. The significant difference in participants' performance at 
level of noise 0 when dealing with these images could have arisen for two 
reasons: i) false color is disruptive in object naming tasks, based on incongruency with 
the stored knowledge of color; ii) changes in luminance with respect to the NC images 
were introduced during hue manipulation. All achromatic stimuli, with either natural or false hue, 
were obtained from their chromatic counterparts without introducing any change in 
luminance during the transformation. Luminance of NC stimuli is the same as the luminance 
of NG stimuli. For the same reason, luminance of FC stimuli is the same as the luminance 
of the FG stimuli. Therefore, differences in responses between NG and NC stimuli or 
between FG and FC stimuli are due just to changes in color. Also, ANOVA results for 
hue, color and noise effects showed that false color was not disruptive during the naming 
task conducted in this experiment. Thus, differences in responses between NC and FC 
images are due just to changes in luminance. 

NG images achieve better performance than their FG counterparts at all 
levels of noise for the dependent variable RT. The results of these two achromatic 
conditions were expected to be similar. These different results cannot be 
explained by differences in color, given that both conditions are achromatic. The 
diverging results must therefore have originated from changes in luminance, possibly 
introduced when hue was manipulated. This follows from the similar value of the differences with 
their chromatic counterparts (NG-NC vs. FG-FC), both for RTs and error rates, and 
from the fact that gray scale images were obtained from their respective 
chromatic counterparts just by eliminating color. Therefore, differences between NC and FC 
stimuli for RTs are due just to changes in luminance. 

One possible explanation of how these manipulations of hue could 
affect the luminance of the image is the following. Measures of luminance for NC and FC stimuli showed 
that luminance increased for some images and decreased for others when the change of 
hue was conducted. Although the mean luminance of all the images for each format did 
not change (at level of noise 0, mean luminance for NC images was 150.06, Std. Dev. 14.86, 
and for FC was 149.46, Std. Dev. 13.79), changes in luminance could have affected each 
image in a different way, with luminance increasing in some places of the image and 
decreasing in others. This could have left mean luminance for each whole image 
unchanged but would have changed the contrast within the images. Poorer contrast in the 
vicinity of the object’s contour could have made object recognition more difficult to 
achieve. 
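
The point that mean luminance can stay constant while contrast changes can be illustrated numerically. The pixel values below are hypothetical, chosen only for the illustration, and the standard deviation of luminance is used as a simple stand-in for contrast; neither comes from the stimuli of this experiment.

```python
from statistics import mean, pstdev

# Hypothetical 8-bit luminance samples across an object contour
# (not measured data from the experiment).
before = [100, 200, 100, 200]   # strong light/dark alternation
after = [130, 170, 130, 170]    # same mean, weaker alternation

for label, pixels in (("before", before), ("after", after)):
    # mean() gives the average luminance; pstdev() is the population
    # standard deviation, used here as a simple contrast measure.
    print(label, mean(pixels), pstdev(pixels))
```

Both sample sets have the same mean, yet the second has a smaller spread, which corresponds to the reduced local contrast hypothesized in the paragraph above.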

Paired comparisons using Tukey’s method were conducted at each level of noise, 
and for both dependent variables RT and error rates, among the different color conditions. 
These results are represented graphically in Table 7. Underlined pairs are non-significant. 
The numeric results of these comparisons can be seen in Tables 8 and 9. These tables 
show how, at each level of noise, participants were faster and more accurate with NC 
stimuli than with any other condition. 

Based on these results the first null hypothesis can be rejected. Differences in 
RTs between NC and FC stimuli were significant at level of noise 0, but they were 
non-significant at the other levels of noise, and the difference between them did not increase 
as the level of noise increased, so the second null hypothesis cannot be rejected. The 
third and fourth null hypotheses can be rejected based on the facts that gray scale images 
achieved the longest RTs and greatest error rates at each level of noise, and that the FC 
images achieved intermediate results, as was hypothesized. For the dependent variable 
error rates, only NC stimuli were significantly different from FG stimuli, at level of 
noise 0 and level of noise 2. 



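RTs Paired Comparisons 

Noise Level 0    NC   NG   FG   FC
                 -------   -------

Noise Level 1    NC   NG   FC   FG
                 ------------
                      ------------

Noise Level 2    NC   FC   NG   FG
                 -------   -------
                      -------

ERs Paired Comparisons 

Noise Level 0    NC   FC   NG   FG
                 ------------
                      ------------

Noise Level 1    NC   NG   FC   FG
                 -----------------

Noise Level 2    NC   NG   FC   FG
                 ------------
                      ------------

Table 7: Tukey’s Method Paired Comparisons. 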

Noise level 0    FC                FG                NG                NC
FC               X                 0.000019          0.000061 (SIG)    0.000066 (SIG)
FG               0.000019          X                 0.000042 (SIG)    0.000047 (SIG)
NG               0.000061 (SIG)    0.000042 (SIG)    X                 0.000005
NC               0.000066 (SIG)    0.000047 (SIG)    0.000005          X
W = 0.000041

Noise level 1    FG                FC                NG                NC
FG               X                 0.000031          0.000036          0.000070 (SIG)
FC               0.000031          X                 0.000005          0.000039
NG               0.000036          0.000005          X                 0.000034
NC               0.000070 (SIG)    0.000039          0.000034          X
W = 0.000043

Noise level 2    FG                NG                FC                NC
FG               X                 0.000044          0.000067 (SIG)    0.000112 (SIG)
NG               0.000044          X                 0.000023          0.000068 (SIG)
FC               0.000067 (SIG)    0.000023          X                 0.000045
NC               0.000112 (SIG)    0.000068 (SIG)    0.000045          X
W = 0.000048

Table 8: Mean reaction times paired comparisons for each level of noise and color 
format (Transformed data). 
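
The significance flags in the paired comparison tables follow a simple decision rule, sketched below for noise level 0 of Table 8. The sketch assumes that the value W listed under each matrix is the critical (minimum significant) difference of Tukey's method; the function name `significant_pairs` is illustrative.

```python
# Absolute differences between mean transformed RTs at noise level 0 (Table 8).
DIFFS_LEVEL0 = {
    ("FC", "FG"): 0.000019,
    ("FC", "NG"): 0.000061,
    ("FC", "NC"): 0.000066,
    ("FG", "NG"): 0.000042,
    ("FG", "NC"): 0.000047,
    ("NG", "NC"): 0.000005,
}
W_LEVEL0 = 0.000041  # value of W reported for noise level 0


def significant_pairs(diffs, w):
    """Return the pairs whose absolute difference exceeds the threshold w."""
    return sorted(pair for pair, d in diffs.items() if abs(d) > w)


for pair in significant_pairs(DIFFS_LEVEL0, W_LEVEL0):
    print(pair, "SIG")
```

Applying the same rule with the W value of each matrix yields the (SIG) markers shown in the tables; the pairs that fall below W are the ones reported as non-significant.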



Noise level 0    FG             NG             FC             NC
FG               X              0.390          0.713          0.856 (SIG)
NG               0.390          X              0.323          0.466
FC               0.713          0.323          X              0.143
NC               0.856 (SIG)    0.466          0.143          X
W = 0.829

Noise level 1    FG             FC             NG             NC
FG               X              0.212          0.269          0.739
FC               0.212          X              0.057          0.527
NG               0.269          0.057          X              0.470
NC               0.739          0.527          0.470          X
W = 1.134

Noise level 2    FG             FC             NG             NC
FG               X              0.363          0.394          1.501 (SIG)
FC               0.363          X              0.031          1.138
NG               0.394          0.031          X              1.107
NC               1.501 (SIG)    1.138          1.107          X
W = 1.290

Table 9: Mean error rates paired comparisons for each level of noise and color 
format (Transformed data). 

In order to investigate a possible learning effect caused by the use of the first 
session of the experiment as training for the second one, the independent variable Block 
was included in the model with two levels, Sessions 1 and 2, to test for a significant 
difference between the results of the two sessions. A greater learning effect for the FC 
conditions could have obscured the assumed detrimental effects of FC in object 
recognition by decreasing RTs and error rates during the second session. This effect could 
be based on the knowledge acquired by the participants during the first session of the 
experiment. The main effect of Block and its interactions with Hue and Color were therefore 
analyzed for both RTs and error rates. ANOVA yielded a significant main 
effect for Block on the dependent variable RT, F(1, 5674) = 67.9391, p < 0.01, and on 
the dependent variable ER, F(1, 285) = 15.3184, p < 0.01. None of the interactions were 
significant for either dependent variable. These were the results for the interactions: 
Hue by Block interaction for RT, F(1, 5674) = 0.6212, p = 0.4306; for error rate, F(1, 285) 
= 0.00263, p = 0.9591. Color by Block interaction for RT, F(1, 5674) = 0.0653, p = 
0.7983; for error rate, F(1, 285) = 0.5874, p = 0.4441. These results suggest that 
participant responses were faster and more accurate during the second session, possibly 
because of a learning effect during the first session of the experiment. But they were not 
significantly faster in any specific color condition; therefore a greater learning effect, not 
just for FC but for any other particular color condition, could not be shown. The learning 
effect was similar for every color condition. 

IV. CONCLUSIONS 



This experiment examined the role of natural and false color in an object 
recognition task, with degraded and non-degraded images, with a view to improving night 
vision devices that employ the new technology of color fusion displays. 

Four hypotheses were stated in the introduction that summarized several 
assumptions based on previous research about the role of color in object recognition. The 
experimental design was employed to explore dependent measures such as reaction time and 
accuracy in object recognition, critical factors when accomplishing military missions in 
which night vision devices are involved. 

The results and discussion presented in the previous chapter supported rejecting 
all but the second null hypothesis. The first, third and fourth null hypotheses were rejected, 
based on data analysis that showed that natural color images achieved the best 
performance at every level of image degradation, achromatic images achieved the longest 
RTs and largest error rates, and false color stimuli reached an intermediate level of 
performance between these two groups of stimuli. The second null hypothesis, that 
differences in RTs between natural color images and their false color counterparts 
do not increase with increasing levels of image degradation, could not be rejected. These 
results are summarized in Table 10. 

Null Hypothesis                     Result             Conclusion

No differences in RTs or ERs        Reject Null        Shorter RTs and smaller ERs
between natural color stimuli       Hypothesis         within natural color stimuli
and other color formats, across                        across all levels of noise
all levels of noise

No increasing differences in RTs    NOT reject Null    (Null hypothesis)
or ERs between natural color        Hypothesis
and false color stimuli for
increasing levels of noise

No differences in RTs or ERs        Reject Null        Longest RTs and greatest ERs
between chromatic and               Hypothesis         within the achromatic stimuli
achromatic stimuli across all                          across all levels of noise
levels of noise

No differences in RTs or ERs        Reject Null        Intermediate results for false
between natural hue and false       Hypothesis         hue stimuli across all levels of
hue stimuli across all levels of                       noise.
noise.

Table 10: Summary of the results. 

Data analysis suggests that differences in performance between natural and false 
hue stimuli were due to differences in luminance and not to chromatic differences, so 
that if two images (NC and FC conditions) were matched for luminance, it should 
be inconsequential whether the hues were natural or false. 

From the analysis conducted to assess the benefit of using color in 
object recognition, it can be concluded that natural and false hue conditions were 
equally beneficial in the task performed during the experiment. There was no 
evidence of false color as a disruptive factor during this task, and both natural and false 
hue were similarly useful at different levels of image degradation. Thus, results indicate 
that participants conducted a bottom-up process during the object recognition task, 
making use of color (natural or false) only to achieve image segmentation. These 
findings are consistent with the views of Wurm & Legge (1993) and Biederman & Ju (1988) 
that primary access to object recognition uses structural (geometrical) representation of ... 
representation is in part generated by the presence of color. Participants did not use color 
to access stored knowledge of the object’s psychological representation. If participants 
were taking advantage of natural color to achieve object recognition, the benefit of natural 
color should have been larger than the benefit of false color, because they would be able to 
carry out top-down and bottom-up recognition processes simultaneously. 

More research must still be done in the field of human nighttime visual 
performance, given that, as already stated, the benefits of integrating synthetic 
color into fused imagery depend on the color algorithm used, the visual task 
performed, and scene content (Steele & Perconti, 1997). Future research should study the 
benefits of using false color in each of these scenarios. 

The results of this study give an indication that false color may be useful in future 
color fusion devices based on its facilitation of image segmentation with shape degraded 
images. Although this study was far from covering all the different scenarios that may arise 
during a nighttime military operation, it shows that it is plausible to consider the use of synthetic 
color in the development of new military night vision devices. 